Class TestUtil


  • public final class TestUtil
    extends Object
    General utility methods for Lucene unit tests.
    • Field Detail

      • STRING_CODEPOINT_COMPARATOR

        public static final Comparator<CharSequence> STRING_CODEPOINT_COMPARATOR
        A comparator that compares UTF-16 strings / char sequences according to Unicode code point order. This can be used to verify BytesRef order.

        Warning: This comparator is rather inefficient, because it converts the strings to an int[] array on each invocation.
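
        As a rough illustration of the ordering described above, the following sketch compares char sequences by code point using only the JDK. The names CodePointOrderSketch and BY_CODE_POINT are hypothetical, and this is not the Lucene implementation; it only mirrors the per-call int[] conversion that the warning mentions.

```java
import java.util.Comparator;

final class CodePointOrderSketch {
  /** Compares char sequences by Unicode code point rather than by UTF-16 code unit. */
  static final Comparator<CharSequence> BY_CODE_POINT = (a, b) -> {
    // Decode both inputs to code points on every call (simple, but allocates each time).
    int[] x = a.codePoints().toArray();
    int[] y = b.codePoints().toArray();
    int n = Math.min(x.length, y.length);
    for (int i = 0; i < n; i++) {
      if (x[i] != y[i]) {
        return Integer.compare(x[i], y[i]);
      }
    }
    // All shared positions are equal: the shorter sequence sorts first.
    return Integer.compare(x.length, y.length);
  };
}
```

        The two orders disagree when a BMP character above the surrogate range (for example U+FFFF) is compared with a supplementary character (for example U+10000): String.compareTo sorts U+FFFF last, while the sketch sorts it first, which is also the order of the corresponding UTF-8 bytes.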

    • Method Detail

      • unzip

        public static void unzip​(InputStream in,
                                 Path destDir)
                          throws IOException
        Convenience method that unzips the contents of the given InputStream into destDir. You must pass it a clean destDir.

        Closes the given InputStream after extracting!

        Throws:
        IOException
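
        The behaviour described for unzip can be approximated with java.util.zip alone. The sketch below is an assumption about how such a helper might look, not the Lucene source: the class UnzipSketch and the roundTrip helper are hypothetical, and the path-escape check is an extra precaution that the documentation above does not mention.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class UnzipSketch {
  /** Extracts every entry of the zip stream into destDir and closes the stream. */
  static void unzip(InputStream in, Path destDir) throws IOException {
    Path root = destDir.toAbsolutePath().normalize();
    // Closing the ZipInputStream also closes the wrapped InputStream.
    try (ZipInputStream zip = new ZipInputStream(in)) {
      ZipEntry entry;
      while ((entry = zip.getNextEntry()) != null) {
        Path target = root.resolve(entry.getName()).normalize();
        // Added precaution (not from the documentation): reject entries that leave destDir.
        if (!target.startsWith(root)) {
          throw new IOException("entry escapes destDir: " + entry.getName());
        }
        if (entry.isDirectory()) {
          Files.createDirectories(target);
        } else {
          Files.createDirectories(target.getParent());
          // Files.copy fails if the target already exists, hence the clean destDir.
          Files.copy(zip, target);
        }
        zip.closeEntry();
      }
    }
  }

  /** Hypothetical helper: zips one entry in memory, unzips it to a temp dir, reads it back. */
  static String roundTrip(String name, String content) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ZipOutputStream out = new ZipOutputStream(bytes)) {
        out.putNextEntry(new ZipEntry(name));
        out.write(content.getBytes(StandardCharsets.UTF_8));
        out.closeEntry();
      }
      Path dir = Files.createTempDirectory("unzip-sketch");
      unzip(new ByteArrayInputStream(bytes.toByteArray()), dir);
      return new String(Files.readAllBytes(dir.resolve(name)), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

        Because the stream is wrapped in try-with-resources, the caller's InputStream is closed when extraction finishes, whether or not an exception occurs.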
      • checkIterator

        public static <T> void checkIterator​(Iterator<T> iterator,
                                             long expectedSize,
                                             boolean allowNull)
        Checks that the provided iterator is well-formed.
        • is read-only: does not allow remove
        • returns expectedSize number of elements
        • does not return null elements, unless allowNull is true.
        • throws NoSuchElementException if next is called after hasNext returns false.
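
        The four conditions above can be sketched as a plain JDK check. This is an illustration under assumptions, not the Lucene code: the class IteratorCheckSketch is hypothetical, failures are reported with AssertionError, and "does not allow remove" is interpreted as remove() throwing UnsupportedOperationException.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

final class IteratorCheckSketch {
  /** Consumes the iterator and throws AssertionError if any condition is violated. */
  static <T> void check(Iterator<T> it, long expectedSize, boolean allowNull) {
    long count = 0;
    while (it.hasNext()) {
      T value = it.next();
      count++;
      if (value == null && !allowNull) {
        throw new AssertionError("iterator returned null");
      }
      // Assumption: a read-only iterator signals this with UnsupportedOperationException.
      boolean removed;
      try {
        it.remove();
        removed = true;
      } catch (UnsupportedOperationException expected) {
        removed = false;
      }
      if (removed) {
        throw new AssertionError("remove() should not be supported");
      }
    }
    if (count != expectedSize) {
      throw new AssertionError("expected " + expectedSize + " elements, got " + count);
    }
    // Once hasNext() has returned false, next() must throw NoSuchElementException.
    boolean exhausted;
    try {
      it.next();
      exhausted = false;
    } catch (NoSuchElementException expected) {
      exhausted = true;
    }
    if (!exhausted) {
      throw new AssertionError("next() after exhaustion should throw");
    }
  }
}
```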
      • checkIterator

        public static <T> void checkIterator​(Iterator<T> iterator)
        Checks that the provided iterator is well-formed.
        • is read-only: does not allow remove
        • does not return null elements.
        • throws NoSuchElementException if next is called after hasNext returns false.
      • syncConcurrentMerges

        public static void syncConcurrentMerges​(IndexWriter writer)
      • syncConcurrentMerges

        public static void syncConcurrentMerges​(MergeScheduler ms)
      • checkIndex

        public static CheckIndex.Status checkIndex​(Directory dir,
                                                   boolean doSlowChecks,
                                                   boolean failFast,
                                                   boolean concurrent,
                                                   ByteArrayOutputStream output)
                                            throws IOException
        If failFast is true, then throw the first exception when index corruption is hit, instead of moving on to other fields/segments to look for any other corruption.
        Throws:
        IOException
      • checkReader

        public static void checkReader​(IndexReader reader)
                                throws IOException
        This runs the CheckIndex tool on the Reader. If any issues are hit, a RuntimeException is thrown.
        Throws:
        IOException
      • nextInt

        public static int nextInt​(Random r,
                                  int start,
                                  int end)
        start and end are BOTH inclusive
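
        Since java.util.Random.nextInt(bound) excludes its upper bound, an inclusive range needs a small adjustment. The sketch below shows one way to do it; the class InclusiveRangeSketch is hypothetical and this is not necessarily how TestUtil computes the value.

```java
import java.util.Random;

final class InclusiveRangeSketch {
  /** Returns a value in [start, end], both ends inclusive. */
  static int nextInt(Random r, int start, int end) {
    if (start > end) {
      throw new IllegalArgumentException("start must be <= end");
    }
    // Widen to long so that end - start + 1 cannot overflow.
    long range = (long) end - start + 1;
    long offset;
    if (range <= Integer.MAX_VALUE) {
      offset = r.nextInt((int) range);
    } else {
      // Very wide ranges: reduce a random long (slightly biased, acceptable for a sketch).
      offset = Math.floorMod(r.nextLong(), range);
    }
    return (int) (start + offset);
  }
}
```

        With this shape, nextInt(r, 7, 7) always returns 7, and nextInt(r, 3, 5) can return 3, 4, or 5.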
      • nextLong

        public static long nextLong​(Random r,
                                    long start,
                                    long end)
        start and end are BOTH inclusive
      • nextBigInteger

        public static BigInteger nextBigInteger​(Random random,
                                                int maxBytes)
        Returns a randomish big integer that uses between 1 and maxBytes bytes of storage.
      • randomSimpleString

        public static String randomSimpleString​(Random r,
                                                int maxLength)
      • randomSimpleString

        public static String randomSimpleString​(Random r,
                                                int minLength,
                                                int maxLength)
      • randomSimpleStringRange

        public static String randomSimpleStringRange​(Random r,
                                                     char minChar,
                                                     char maxChar,
                                                     int maxLength)
      • randomSimpleString

        public static String randomSimpleString​(Random r)
      • randomUnicodeString

        public static String randomUnicodeString​(Random r)
        Returns a random string drawn from the full Unicode range.
      • randomUnicodeString

        public static String randomUnicodeString​(Random r,
                                                 int maxLength)
        Returns a random string up to a certain length.
      • randomFixedLengthUnicodeString

        public static void randomFixedLengthUnicodeString​(Random random,
                                                          char[] chars,
                                                          int offset,
                                                          int length)
        Fills the provided char[] with a valid random Unicode code unit sequence.
      • randomRegexpishString

        public static String randomRegexpishString​(Random r)
        Returns a String that's "regexpish" (contains lots of operators typically found in regular expressions). If you call this enough times, you might get a valid regex!
      • randomRegexpishString

        public static String randomRegexpishString​(Random r,
                                                   int maxLength)
        Returns a String that's "regexpish" (contains lots of operators typically found in regular expressions). If you call this enough times, you might get a valid regex!

        Note: to avoid practically endless backtracking patterns we replace asterisk and plus operators with bounded repetitions. See LUCENE-4111 for more info.

        Parameters:
        maxLength - A hint about maximum length of the regexpish string. It may be exceeded by a few characters.
      • randomHtmlishString

        public static String randomHtmlishString​(Random random,
                                                 int numElements)
      • randomlyRecaseCodePoints

        public static String randomlyRecaseCodePoints​(Random random,
                                                      String str)
        Randomly upcases, downcases, or leaves intact each code point in the given string.
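
        A minimal sketch of per-code-point recasing follows. It is an assumption about the general technique rather than the Lucene source, and the class RecaseSketch is hypothetical. Working on code points instead of chars keeps surrogate pairs intact.

```java
import java.util.Random;

final class RecaseSketch {
  /** Upcases, downcases, or keeps each code point, chosen uniformly at random. */
  static String recase(Random random, String str) {
    StringBuilder out = new StringBuilder(str.length());
    int pos = 0;
    while (pos < str.length()) {
      int codePoint = str.codePointAt(pos);
      // Advance by one or two chars depending on whether this is a surrogate pair.
      pos += Character.charCount(codePoint);
      switch (random.nextInt(3)) {
        case 0:
          out.appendCodePoint(Character.toUpperCase(codePoint));
          break;
        case 1:
          out.appendCodePoint(Character.toLowerCase(codePoint));
          break;
        default:
          out.appendCodePoint(codePoint);
          break;
      }
    }
    return out.toString();
  }
}
```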
      • randomRealisticUnicodeString

        public static String randomRealisticUnicodeString​(Random r)
        Returns a random string of between 0 and 20 code points, all within the same Unicode block.
      • randomRealisticUnicodeString

        public static String randomRealisticUnicodeString​(Random r,
                                                          int maxLength)
        Returns a random string of up to maxLength code points, all within the same Unicode block.
      • randomRealisticUnicodeString

        public static String randomRealisticUnicodeString​(Random r,
                                                          int minLength,
                                                          int maxLength)
        Returns a random string of between minLength and maxLength code points, all within the same Unicode block.
      • randomFixedByteLengthUnicodeString

        public static String randomFixedByteLengthUnicodeString​(Random r,
                                                                int length)
        Returns a random string with the given UTF-8 byte length.
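
        One way to hit an exact UTF-8 byte length is to choose, for each code point, an encoded size that still fits in the remaining budget. The sketch below illustrates that idea; the class Utf8LengthSketch and the uniform choice of sizes are assumptions, not the Lucene implementation.

```java
import java.util.Random;

final class Utf8LengthSketch {
  /** Returns a random string whose UTF-8 encoding is exactly length bytes long. */
  static String randomWithUtf8Length(Random r, int length) {
    if (length < 0) {
      throw new IllegalArgumentException("length must be >= 0");
    }
    StringBuilder sb = new StringBuilder();
    int remaining = length;
    while (remaining > 0) {
      // Pick an encoded size of 1..4 bytes that does not exceed the remaining budget.
      int size = 1 + r.nextInt(Math.min(4, remaining));
      int cp;
      switch (size) {
        case 1: // U+0000..U+007F
          cp = r.nextInt(0x80);
          break;
        case 2: // U+0080..U+07FF
          cp = 0x80 + r.nextInt(0x800 - 0x80);
          break;
        case 3: // U+0800..U+FFFF, skipping the surrogate range
          do {
            cp = 0x800 + r.nextInt(0x10000 - 0x800);
          } while (cp >= 0xD800 && cp <= 0xDFFF);
          break;
        default: // U+10000..U+10FFFF
          cp = 0x10000 + r.nextInt(0x110000 - 0x10000);
          break;
      }
      sb.appendCodePoint(cp);
      remaining -= size;
    }
    return sb.toString();
  }
}
```

        Surrogate code points are skipped in the three-byte case because an unpaired surrogate cannot be encoded as valid UTF-8.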
      • randomBinaryTerm

        public static BytesRef randomBinaryTerm​(Random r)
        Returns a random binary term.
      • randomBinaryTerm

        public static BytesRef randomBinaryTerm​(Random r,
                                                int length)
        Returns a random binary term with the given length.
      • alwaysPostingsFormat

        public static Codec alwaysPostingsFormat​(PostingsFormat format)
        Return a Codec that can read any of the default codecs and formats, but always writes in the specified format.
      • alwaysDocValuesFormat

        public static Codec alwaysDocValuesFormat​(DocValuesFormat format)
        Return a Codec that can read any of the default codecs and formats, but always writes in the specified format.
      • getDefaultCodec

        public static Codec getDefaultCodec()
        Returns the actual default codec (e.g. LuceneMNCodec) for this version of Lucene. This may be different than Codec.getDefault() because that is randomized.
      • getDefaultPostingsFormat

        public static PostingsFormat getDefaultPostingsFormat()
        Returns the actual default postings format (e.g. LuceneMNPostingsFormat) for this version of Lucene.
      • getDefaultPostingsFormat

        public static PostingsFormat getDefaultPostingsFormat​(int minItemsPerBlock,
                                                              int maxItemsPerBlock)
        Returns the actual default postings format (e.g. LuceneMNPostingsFormat) for this version of Lucene.
        NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
        This may disappear at any time.
      • getPostingsFormatWithOrds

        public static PostingsFormat getPostingsFormatWithOrds​(Random r)
        Returns a random postings format that supports term ordinals.
      • getDefaultDocValuesFormat

        public static DocValuesFormat getDefaultDocValuesFormat()
        Returns the actual default docvalues format (e.g. LuceneMNDocValuesFormat) for this version of Lucene.
      • getPostingsFormat

        public static String getPostingsFormat​(String field)
      • getPostingsFormat

        public static String getPostingsFormat​(Codec codec,
                                               String field)
      • getDocValuesFormat

        public static String getDocValuesFormat​(String field)
      • getDocValuesFormat

        public static String getDocValuesFormat​(Codec codec,
                                                String field)
      • fieldSupportsHugeBinaryDocValues

        public static boolean fieldSupportsHugeBinaryDocValues​(String field)
      • getDefaultKnnVectorsFormat

        public static KnnVectorsFormat getDefaultKnnVectorsFormat()
        Returns the actual default vector format (e.g. LuceneMNKnnVectorsFormat) for this version of Lucene.
      • reduceOpenFiles

        public static void reduceOpenFiles​(IndexWriter w)
        Tries to configure the writer so that the open file count stays lowish.
      • assertAttributeReflection

        public static <T> void assertAttributeReflection​(AttributeImpl att,
                                                         Map<String,​T> reflectedValues)
        Checks some basic behaviour of an AttributeImpl.
        Parameters:
        reflectedValues - a map from "AttributeClass#key" strings to the expected reflected values
      • assertConsistent

        public static void assertConsistent​(TopDocs expected,
                                            TopDocs actual)
        Assert that the given TopDocs have the same top docs and consistent hit counts.
      • randomPattern

        public static Pattern randomPattern​(Random random)
        Returns a valid (compiling) Pattern instance with random stuff inside. Be careful when applying random patterns to longer strings as certain types of patterns may explode into exponential times in backtracking implementations (such as Java's).
      • randomAnalysisString

        public static String randomAnalysisString​(Random random,
                                                  int maxLength,
                                                  boolean simple)
      • randomSubString

        public static String randomSubString​(Random random,
                                             int wordLength,
                                             boolean simple)
      • bytesRefToString

        public static String bytesRefToString​(BytesRef br)
        For debugging: tries to include br.utf8ToString(), but if that fails (because the bytes are not valid UTF-8, which is fine!), falls back to the ordinary toString.
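
        The try-then-fall-back pattern can be illustrated on a plain byte[] with the JDK's strict UTF-8 decoder. Everything here is hypothetical (the class BytesToStringSketch, the describe and hex helpers, and the exact output format); it does not use BytesRef and is not the Lucene implementation.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class BytesToStringSketch {
  /** Renders bytes as space-separated hex inside brackets, e.g. [61 62]. */
  static String hex(byte[] bytes) {
    StringBuilder sb = new StringBuilder("[");
    for (int i = 0; i < bytes.length; i++) {
      if (i > 0) {
        sb.append(' ');
      }
      sb.append(Integer.toHexString(bytes[i] & 0xff));
    }
    return sb.append(']').toString();
  }

  /** Returns decoded text plus hex when the bytes are valid UTF-8, otherwise hex only. */
  static String describe(byte[] bytes) {
    try {
      // REPORT makes the decoder throw on invalid input instead of substituting U+FFFD.
      String text = StandardCharsets.UTF_8.newDecoder()
          .onMalformedInput(CodingErrorAction.REPORT)
          .onUnmappableCharacter(CodingErrorAction.REPORT)
          .decode(ByteBuffer.wrap(bytes))
          .toString();
      return text + " " + hex(bytes);
    } catch (CharacterCodingException e) {
      // Not valid UTF-8, which is fine for binary terms: show the raw bytes only.
      return hex(bytes);
    }
  }
}
```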
      • hasWindowsFS

        public static boolean hasWindowsFS​(Directory dir)
      • hasWindowsFS

        public static boolean hasWindowsFS​(Path path)
      • hasVirusChecker

        public static boolean hasVirusChecker​(Directory dir)
      • hasVirusChecker

        public static boolean hasVirusChecker​(Path path)
      • disableVirusChecker

        public static boolean disableVirusChecker​(Directory in)
        Returns true if VirusCheckingFS is in use and was in fact already enabled.
      • enableVirusChecker

        public static void enableVirusChecker​(Directory in)