Class TestUtil

java.lang.Object
org.apache.lucene.tests.util.TestUtil

public final class TestUtil extends Object
General utility methods for Lucene unit tests.
  • Field Details

    • STRING_CODEPOINT_COMPARATOR

      public static final Comparator<CharSequence> STRING_CODEPOINT_COMPARATOR
      A comparator that compares UTF-16 strings / char sequences according to Unicode code point order. This can be used to verify BytesRef order.

      Warning: This comparator is rather inefficient, because it converts the strings to an int[] array on each invocation.
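
To illustrate why code point order can differ from the natural String order, here is a minimal stdlib-only sketch. It is not the Lucene implementation; the class name CodePointOrderSketch and the field BY_CODE_POINT are hypothetical. Like the comparator described above, it converts both inputs to int[] on every call.

```java
import java.util.Comparator;

final class CodePointOrderSketch {
  // Hypothetical stand-in: orders char sequences by Unicode code point,
  // not by UTF-16 code unit as String.compareTo does.
  static final Comparator<CharSequence> BY_CODE_POINT = (a, b) -> {
    int[] left = a.codePoints().toArray();   // conversion on every call: simple but slow
    int[] right = b.codePoints().toArray();
    int shared = Math.min(left.length, right.length);
    for (int i = 0; i < shared; i++) {
      if (left[i] != right[i]) {
        return Integer.compare(left[i], right[i]);
      }
    }
    // All shared positions are equal, so the shorter sequence sorts first.
    return Integer.compare(left.length, right.length);
  };
}
```

The two orders disagree when a supplementary character (encoded as a surrogate pair) is compared with a BMP character above the surrogate range, for example U+1F600 against U+FB01. UTF-8 byte order agrees with code point order, which is why such a comparator is useful for cross-checking byte-wise term order.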

  • Method Details

    • unzip

      public static void unzip(InputStream in, Path destDir) throws IOException
      Convenience method that unzips the contents of the given InputStream into destDir. You must pass it a clean destDir.

      Closes the given InputStream after extracting!

      Throws:
      IOException
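
The contract described for unzip (extract into a clean directory, then close the stream) can be sketched with java.util.zip alone. This is an illustrative approximation, not the Lucene source; the class name UnzipSketch and the path-escape check are assumptions added for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

final class UnzipSketch {
  static void unzip(InputStream in, Path destDir) throws IOException {
    Path root = destDir.toAbsolutePath().normalize();
    // try-with-resources closes the ZipInputStream, and with it the caller's stream.
    try (ZipInputStream zip = new ZipInputStream(in)) {
      ZipEntry entry;
      while ((entry = zip.getNextEntry()) != null) {
        Path target = root.resolve(entry.getName()).normalize();
        // Assumption for this sketch: reject entries that would land outside destDir.
        if (!target.startsWith(root)) {
          throw new IOException("Entry escapes destination: " + entry.getName());
        }
        if (entry.isDirectory()) {
          Files.createDirectories(target);
        } else {
          Files.createDirectories(target.getParent());
          // Files.copy refuses to overwrite, which is one reason destDir must be clean.
          Files.copy(zip, target);
        }
      }
    }
  }
}
```

Because the stream is closed on return, a caller should not reuse it after the call.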
    • checkIterator

      public static <T> void checkIterator(Iterator<T> iterator, long expectedSize, boolean allowNull)
      Checks that the provided iterator is well-formed.
      • is read-only: does not allow remove
      • returns expectedSize number of elements
      • does not return null elements, unless allowNull is true.
      • throws NoSuchElementException if next is called after hasNext returns false.
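
The four conditions listed above can be expressed as a small stdlib-only checker. This is a sketch of the idea, not the Lucene implementation; the class name IteratorContractSketch and the method check are hypothetical, and failures are reported with AssertionError purely for illustration.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

final class IteratorContractSketch {
  static <T> void check(Iterator<T> it, long expectedSize, boolean allowNull) {
    long count = 0;
    while (it.hasNext()) {
      T value = it.next();
      if (value == null && !allowNull) {
        throw new AssertionError("iterator returned a null element");
      }
      count++;
      try {
        it.remove();
        throw new AssertionError("remove() succeeded on a supposedly read-only iterator");
      } catch (UnsupportedOperationException expected) {
        // Read-only iterators are expected to reject remove().
      }
    }
    if (count != expectedSize) {
      throw new AssertionError("expected " + expectedSize + " elements but saw " + count);
    }
    try {
      it.next();
      throw new AssertionError("next() after exhaustion did not throw");
    } catch (NoSuchElementException expected) {
      // Correct behaviour once hasNext() has returned false.
    }
  }
}
```

An iterator obtained from Collections.unmodifiableList passes this check, while one from a plain ArrayList fails on the remove() step.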
    • checkIterator

      public static <T> void checkIterator(Iterator<T> iterator)
      Checks that the provided iterator is well-formed.
      • is read-only: does not allow remove
      • does not return null elements.
      • throws NoSuchElementException if next is called after hasNext returns false.
    • checkReadOnly

      public static <T> void checkReadOnly(Collection<T> coll)
      Checks that the provided collection is read-only.
    • syncConcurrentMerges

      public static void syncConcurrentMerges(IndexWriter writer)
    • syncConcurrentMerges

      public static void syncConcurrentMerges(MergeScheduler ms)
    • checkIndex

      public static CheckIndex.Status checkIndex(Directory dir) throws IOException
      Runs the CheckIndex tool on the index in dir. If any issues are hit, a RuntimeException is thrown; otherwise, the resulting CheckIndex.Status is returned.
      Throws:
      IOException
    • checkIndex

      public static CheckIndex.Status checkIndex(Directory dir, boolean doSlowChecks) throws IOException
      Throws:
      IOException
    • checkIndex

      public static CheckIndex.Status checkIndex(Directory dir, boolean doSlowChecks, boolean failFast, boolean concurrent, ByteArrayOutputStream output) throws IOException
      If failFast is true, then throw the first exception when index corruption is hit, instead of moving on to other fields/segments to look for any other corruption.
      Throws:
      IOException
    • checkReader

      public static void checkReader(IndexReader reader) throws IOException
      Runs the CheckIndex tool on the reader. If any issues are hit, a RuntimeException is thrown.
      Throws:
      IOException
    • checkReader

      public static void checkReader(LeafReader reader, boolean doSlowChecks) throws IOException
      Throws:
      IOException
    • nextInt

      public static int nextInt(Random r, int start, int end)
      Both start and end are inclusive.
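
Inclusive bounds differ from java.util.Random.nextInt(bound), whose upper bound is exclusive. The following sketch shows one way to draw from a closed range without overflowing when the range spans the whole int domain. It is an assumption-laden illustration, not the Lucene code; the class name InclusiveRangeSketch is hypothetical, and the modulo step introduces a negligible bias that a production implementation might avoid.

```java
import java.util.Random;

final class InclusiveRangeSketch {
  // Returns a value in [start, end], both ends included.
  static int nextInt(Random r, int start, int end) {
    if (start > end) {
      throw new IllegalArgumentException("start must be <= end");
    }
    // Computed in long arithmetic: the span can be as large as 2^32.
    long span = (long) end - (long) start + 1;
    // floorMod maps any long onto [0, span), even for negative inputs.
    long offset = Math.floorMod(r.nextLong(), span);
    return (int) (start + offset);
  }
}
```

With start equal to end the method always returns that single value, and a call such as nextInt(r, Integer.MIN_VALUE, Integer.MAX_VALUE) is valid.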
    • nextLong

      public static long nextLong(Random r, long start, long end)
      Both start and end are inclusive.
    • nextBigInteger

      public static BigInteger nextBigInteger(Random random, int maxBytes)
      Returns a randomish big integer that occupies between 1 and maxBytes bytes of storage.
    • randomSimpleString

      public static String randomSimpleString(Random r, int maxLength)
    • randomSimpleString

      public static String randomSimpleString(Random r, int minLength, int maxLength)
    • randomSimpleStringRange

      public static String randomSimpleStringRange(Random r, char minChar, char maxChar, int maxLength)
    • randomSimpleString

      public static String randomSimpleString(Random r)
    • randomUnicodeString

      public static String randomUnicodeString(Random r)
      Returns a random string, including the full Unicode range.
    • randomUnicodeString

      public static String randomUnicodeString(Random r, int maxLength)
      Returns a random string up to a certain length.
    • randomFixedLengthUnicodeString

      public static void randomFixedLengthUnicodeString(Random random, char[] chars, int offset, int length)
      Fills the provided char[] with a valid random Unicode code unit sequence.
    • randomRegexpishString

      public static String randomRegexpishString(Random r)
      Returns a String that's "regexpish" (contains lots of operators typically found in regular expressions). If you call this enough times, you might get a valid regex!
    • randomRegexpishString

      public static String randomRegexpishString(Random r, int maxLength)
      Returns a String that's "regexpish" (contains lots of operators typically found in regular expressions). If you call this enough times, you might get a valid regex!

      Note: to avoid practically endless backtracking patterns we replace asterisk and plus operators with bounded repetitions. See LUCENE-4111 for more info.

      Parameters:
      maxLength - A hint about maximum length of the regexpish string. It may be exceeded by a few characters.
    • randomHtmlishString

      public static String randomHtmlishString(Random random, int numElements)
    • randomlyRecaseCodePoints

      public static String randomlyRecaseCodePoints(Random random, String str)
      Randomly upcases, downcases, or leaves intact each code point in the given string.
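
A per-code-point recasing pass can be sketched as follows. This is not the Lucene implementation; the class name RecaseSketch and the three-way uniform choice are assumptions made for the example. Working on code points rather than chars keeps surrogate pairs intact.

```java
import java.util.Random;

final class RecaseSketch {
  static String randomlyRecase(Random random, String str) {
    StringBuilder out = new StringBuilder(str.length());
    str.codePoints().forEach(cp -> {
      // Assumed policy: pick one of three outcomes with equal probability.
      switch (random.nextInt(3)) {
        case 0:
          out.appendCodePoint(Character.toUpperCase(cp));
          break;
        case 1:
          out.appendCodePoint(Character.toLowerCase(cp));
          break;
        default:
          out.appendCodePoint(cp); // leave this code point untouched
      }
    });
    return out.toString();
  }
}
```

For plain ASCII input the result always compares equal to the original under equalsIgnoreCase, which makes the helper handy for exercising case-insensitive code paths.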
    • randomRealisticUnicodeString

      public static String randomRealisticUnicodeString(Random r)
      Returns a random string of between 0 and 20 code points, all within the same Unicode block.
    • randomRealisticUnicodeString

      public static String randomRealisticUnicodeString(Random r, int maxLength)
      Returns a random string of up to maxLength code points, all within the same Unicode block.
    • randomRealisticUnicodeString

      public static String randomRealisticUnicodeString(Random r, int minLength, int maxLength)
      Returns a random string of between minLength and maxLength code points, all within the same Unicode block.
    • randomFixedByteLengthUnicodeString

      public static String randomFixedByteLengthUnicodeString(Random r, int length)
      Returns a random string with the given UTF-8 byte length.
    • randomBinaryTerm

      public static BytesRef randomBinaryTerm(Random r)
      Returns a random binary term.
    • randomBinaryTerm

      public static BytesRef randomBinaryTerm(Random r, int length)
      Returns a random binary term with the given length.
    • alwaysPostingsFormat

      public static Codec alwaysPostingsFormat(PostingsFormat format)
      Return a Codec that can read any of the default codecs and formats, but always writes in the specified format.
    • alwaysDocValuesFormat

      public static Codec alwaysDocValuesFormat(DocValuesFormat format)
      Return a Codec that can read any of the default codecs and formats, but always writes in the specified format.
    • getDefaultCodec

      public static Codec getDefaultCodec()
      Returns the actual default codec (e.g. LuceneMNCodec) for this version of Lucene. This may be different than Codec.getDefault() because that is randomized.
    • getDefaultPostingsFormat

      public static PostingsFormat getDefaultPostingsFormat()
      Returns the actual default postings format (e.g. LuceneMNPostingsFormat) for this version of Lucene.
    • getDefaultPostingsFormat

      public static PostingsFormat getDefaultPostingsFormat(int minItemsPerBlock, int maxItemsPerBlock)
      Returns the actual default postings format (e.g. LuceneMNPostingsFormat) for this version of Lucene.
      NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
      This may disappear at any time.
    • getPostingsFormatWithOrds

      public static PostingsFormat getPostingsFormatWithOrds(Random r)
      Returns a random postings format that supports term ordinals.
    • getDefaultDocValuesFormat

      public static DocValuesFormat getDefaultDocValuesFormat()
      Returns the actual default docvalues format (e.g. LuceneMNDocValuesFormat) for this version of Lucene.
    • getPostingsFormat

      public static String getPostingsFormat(String field)
    • getPostingsFormat

      public static String getPostingsFormat(Codec codec, String field)
    • getDocValuesFormat

      public static String getDocValuesFormat(String field)
    • getDocValuesFormat

      public static String getDocValuesFormat(Codec codec, String field)
    • fieldSupportsHugeBinaryDocValues

      public static boolean fieldSupportsHugeBinaryDocValues(String field)
    • getDefaultKnnVectorsFormat

      public static KnnVectorsFormat getDefaultKnnVectorsFormat()
      Returns the actual default vector format (e.g. LuceneMNKnnVectorsFormat) for this version of Lucene.
    • anyFilesExceptWriteLock

      public static boolean anyFilesExceptWriteLock(Directory dir) throws IOException
      Throws:
      IOException
    • addIndexesSlowly

      public static void addIndexesSlowly(IndexWriter writer, DirectoryReader... readers) throws IOException
      Throws:
      IOException
    • reduceOpenFiles

      public static void reduceOpenFiles(IndexWriter w)
      Tries to configure the writer so that the open file count stays lowish.
    • assertAttributeReflection

      public static <T> void assertAttributeReflection(AttributeImpl att, Map<String,T> reflectedValues)
      Checks some basic behaviour of an AttributeImpl.
      Parameters:
      reflectedValues - a map of the expected reflected values, keyed by "AttributeClass#key"
    • assertConsistent

      public static void assertConsistent(TopDocs expected, TopDocs actual)
      Assert that the given TopDocs have the same top docs and consistent hit counts.
    • cloneDocument

      public static Document cloneDocument(Document doc1)
    • docs

      public static PostingsEnum docs(Random random, IndexReader r, String field, BytesRef term, PostingsEnum reuse, int flags) throws IOException
      Throws:
      IOException
    • docs

      public static PostingsEnum docs(Random random, TermsEnum termsEnum, PostingsEnum reuse, int flags) throws IOException
      Throws:
      IOException
    • stringToCharSequence

      public static CharSequence stringToCharSequence(String string, Random random)
    • bytesToCharSequence

      public static CharSequence bytesToCharSequence(BytesRef ref, Random random)
    • shutdownExecutorService

      public static void shutdownExecutorService(ExecutorService ex)
      Shuts down the ExecutorService and waits for its termination.
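
The usual shutdown-then-wait pattern with java.util.concurrent looks roughly like the sketch below. It is not the Lucene source; the class name ExecutorShutdownSketch, the explicit timeout parameters, and the boolean return value are assumptions for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

final class ExecutorShutdownSketch {
  // Returns true if the executor terminated within the timeout.
  static boolean shutdownAndWait(ExecutorService ex, long timeout, TimeUnit unit) {
    if (ex == null) {
      return true; // nothing to shut down
    }
    ex.shutdown(); // stop accepting new tasks; already submitted tasks still run
    try {
      return ex.awaitTermination(timeout, unit);
    } catch (InterruptedException e) {
      // Restore the interrupt flag so callers can react to it.
      Thread.currentThread().interrupt();
      return false;
    }
  }
}
```

Calling shutdown() alone returns immediately; the awaitTermination step is what lets a test observe the effects of every submitted task before asserting on them.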
    • randomPattern

      public static Pattern randomPattern(Random random)
      Returns a valid (compiling) Pattern instance with random stuff inside. Be careful when applying random patterns to longer strings as certain types of patterns may explode into exponential times in backtracking implementations (such as Java's).
    • randomAnalysisString

      public static String randomAnalysisString(Random random, int maxLength, boolean simple)
    • randomSubString

      public static String randomSubString(Random random, int wordLength, boolean simple)
    • bytesRefToString

      public static String bytesRefToString(BytesRef br)
      For debugging: tries to include br.utf8ToString(), but if that fails (because the bytes are not valid UTF-8, which is fine!), just uses the ordinary toString.
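
The try-to-decode-then-fall-back idea can be sketched on a plain byte[] using a strict UTF-8 decoder. This is an approximation of the behaviour described above, not the Lucene code: the class name BytesToStringSketch is hypothetical, and the bracketed hex fallback format is an assumption chosen for the example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class BytesToStringSketch {
  static String describe(byte[] bytes) {
    try {
      // REPORT makes the decoder throw on invalid input instead of substituting U+FFFD.
      return StandardCharsets.UTF_8.newDecoder()
          .onMalformedInput(CodingErrorAction.REPORT)
          .onUnmappableCharacter(CodingErrorAction.REPORT)
          .decode(ByteBuffer.wrap(bytes))
          .toString();
    } catch (CharacterCodingException e) {
      // Not valid UTF-8: fall back to a hex dump (format assumed for this sketch).
      StringBuilder hex = new StringBuilder("[");
      for (int i = 0; i < bytes.length; i++) {
        if (i > 0) {
          hex.append(' ');
        }
        hex.append(Integer.toHexString(bytes[i] & 0xff));
      }
      return hex.append(']').toString();
    }
  }
}
```

A strict decoder matters here because new String(bytes, UTF_8) never fails; it silently replaces invalid sequences, which would hide exactly the bytes a debugging message needs to show.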
    • ramCopyOf

      public static Directory ramCopyOf(Directory dir) throws IOException
      Returns a copy of the source directory, with file contents stored in RAM.
      Throws:
      IOException
    • hasWindowsFS

      public static boolean hasWindowsFS(Directory dir)
    • hasWindowsFS

      public static boolean hasWindowsFS(Path path)
    • hasVirusChecker

      public static boolean hasVirusChecker(Directory dir)
    • hasVirusChecker

      public static boolean hasVirusChecker(Path path)
    • disableVirusChecker

      public static boolean disableVirusChecker(Directory in)
      Returns true if VirusCheckingFS is in use and was in fact already enabled.
    • enableVirusChecker

      public static void enableVirusChecker(Directory in)