Class ParallelLeafReader

All Implemented Interfaces:
Closeable, AutoCloseable

public class ParallelLeafReader extends LeafReader
A LeafReader which reads multiple, parallel indexes. Each index added must have the same number of documents, but typically each contains different fields. Deletions are taken from the first reader. Each document contains the union of the fields of all documents with the same document number. When searching, matches for a query term are taken from the first index added that has the field.

This is useful, e.g., with collections that have large fields which change rarely and small fields that change more frequently. The smaller fields may be re-indexed in a new index and both indexes may be searched together.

Warning: It is up to you to make sure all indexes are created and modified the same way. For example, if you add documents to one index, you need to add the same documents in the same order to the other indexes. Failure to do so will result in undefined behavior.
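
The "first index added that has the field" rule described above can be illustrated with a small toy model. This is a sketch under stated assumptions, not Lucene's implementation: plain sets of field names stand in for indexes, and the class and method names (ParallelFieldOwnerSketch, fieldOwners) are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model only: sets of field names stand in for indexes.
// None of these names are Lucene APIs.
final class ParallelFieldOwnerSketch {

  /** Maps each field name to the position of the first index (in the order added) that has it. */
  static Map<String, Integer> fieldOwners(List<Set<String>> fieldsPerIndex) {
    Map<String, Integer> owners = new TreeMap<>();
    for (int i = 0; i < fieldsPerIndex.size(); i++) {
      for (String field : fieldsPerIndex.get(i)) {
        owners.putIfAbsent(field, i); // an earlier index keeps the field
      }
    }
    return owners;
  }

  public static void main(String[] args) {
    // Both indexes have "id"; only the first one is consulted for it.
    List<Set<String>> fieldsPerIndex = List.of(Set.of("id", "body"), Set.of("id", "price"));
    System.out.println(fieldOwners(fieldsPerIndex)); // prints {body=0, id=0, price=1}
  }
}
```

In this model a field that appears in several indexes is always served by the earliest one, which is why the order in which readers are supplied matters.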

  • Constructor Details

    • ParallelLeafReader

      public ParallelLeafReader(LeafReader... readers) throws IOException
      Create a ParallelLeafReader based on the provided readers; auto-closes the given readers on IndexReader.close().
      Throws:
      IOException
    • ParallelLeafReader

      public ParallelLeafReader(boolean closeSubReaders, LeafReader... readers) throws IOException
      Create a ParallelLeafReader based on the provided readers.
      Throws:
      IOException
    • ParallelLeafReader

      public ParallelLeafReader(boolean closeSubReaders, LeafReader[] readers, LeafReader[] storedFieldsReaders) throws IOException
      Expert: create a ParallelLeafReader based on the provided readers and storedFieldsReaders; when a document is loaded, only storedFieldsReaders will be used.
      Throws:
      IOException
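
      The effect of the closeSubReaders flag can be sketched with plain java.io.Closeable objects. This is a toy model that assumes closeSubReaders = true means the wrapper closes the readers it was given, while false leaves their lifecycle to the caller. The class name CloseSubReadersSketch is hypothetical, and any reference counting the real class may do is not modeled.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.List;

// Toy model only: Closeable stands in for LeafReader; this is not a Lucene class.
final class CloseSubReadersSketch implements Closeable {
  private final boolean closeSubReaders;
  private final List<Closeable> readers;

  CloseSubReadersSketch(boolean closeSubReaders, Closeable... readers) {
    this.closeSubReaders = closeSubReaders;
    this.readers = List.of(readers);
  }

  @Override
  public void close() throws IOException {
    if (!closeSubReaders) {
      return; // assumption: the caller remains responsible for the sub-readers
    }
    for (Closeable reader : readers) {
      reader.close();
    }
  }

  public static void main(String[] args) throws IOException {
    int[] closed = {0};
    Closeable sub = () -> closed[0]++;
    new CloseSubReadersSketch(true, sub, sub).close();
    new CloseSubReadersSketch(false, sub).close();
    System.out.println(closed[0]); // prints 2
  }
}
```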
  • Method Details

    • toString

      public String toString()
      Overrides:
      toString in class Object
    • getFieldInfos

      public FieldInfos getFieldInfos()
      Get the FieldInfos describing all fields in this reader.

      Note: Implementations should cache the FieldInfos instance returned by this method such that subsequent calls to this method return the same instance.

      NOTE: the returned field numbers will likely not correspond to the actual field numbers in the underlying readers, and codec metadata (FieldInfo.getAttribute(String)) will be unavailable.

      Specified by:
      getFieldInfos in class LeafReader
    • getLiveDocs

      public Bits getLiveDocs()
      Description copied from class: LeafReader
      Returns the Bits representing live (not deleted) docs. A set bit indicates the doc ID has not been deleted. If this method returns null it means there are no deleted documents (all documents are live).

      The returned instance has been safely published for use by multiple threads without additional synchronization.

      Specified by:
      getLiveDocs in class LeafReader
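
      The null convention described above is easy to get wrong, so a null-safe check may help. The sketch below is a toy model: LiveBits is a hypothetical single-method stand-in for Bits, and isLive and countLive are illustrative helper names, not Lucene methods.

```java
// Toy model only: LiveBits is a hypothetical stand-in for Lucene's Bits,
// reduced to the one method this sketch needs.
final class LiveDocsSketch {

  interface LiveBits {
    boolean get(int docID);
  }

  /** A null liveDocs means there are no deletions, so every document is live. */
  static boolean isLive(LiveBits liveDocs, int docID) {
    return liveDocs == null || liveDocs.get(docID);
  }

  /** Counts live documents among doc IDs 0 .. maxDoc - 1. */
  static int countLive(LiveBits liveDocs, int maxDoc) {
    int live = 0;
    for (int docID = 0; docID < maxDoc; docID++) {
      if (isLive(liveDocs, docID)) {
        live++;
      }
    }
    return live;
  }

  public static void main(String[] args) {
    System.out.println(countLive(null, 5));                // prints 5
    System.out.println(countLive(docID -> docID != 1, 5)); // prints 4
  }
}
```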
    • terms

      public Terms terms(String field) throws IOException
      Description copied from class: LeafReader
      Returns the Terms index for this field, or null if it has none.
      Specified by:
      terms in class LeafReader
      Throws:
      IOException
    • numDocs

      public int numDocs()
      Description copied from class: IndexReader
      Returns the number of documents in this index.

      NOTE: This operation may run in O(maxDoc). Implementations that can't return this number in constant-time should cache it.

      Specified by:
      numDocs in class IndexReader
    • maxDoc

      public int maxDoc()
      Description copied from class: IndexReader
      Returns one greater than the largest possible document number. This may be used to, e.g., determine how big to allocate an array which will have an element for every document number in an index.
      Specified by:
      maxDoc in class IndexReader
    • document

      public void document(int docID, StoredFieldVisitor visitor) throws IOException
      Description copied from class: IndexReader
      Expert: visits the fields of a stored document, for custom processing/loading of each field. If you simply want to load all fields, use IndexReader.document(int). If you want to load a subset, use DocumentStoredFieldVisitor.
      Specified by:
      document in class IndexReader
      Throws:
      IOException
    • getCoreCacheHelper

      public IndexReader.CacheHelper getCoreCacheHelper()
      Description copied from class: LeafReader
      Optional method: Return an IndexReader.CacheHelper that can be used to cache based on the content of this leaf regardless of deletions. Two readers that have the same data but different sets of deleted documents or doc values updates may be considered equal. Consider using IndexReader.getReaderCacheHelper() if you need deletions or dv updates to be taken into account.

      A return value of null indicates that this reader is not suited for caching, which is typically the case for short-lived wrappers that alter the content of the wrapped leaf reader.

      Specified by:
      getCoreCacheHelper in class LeafReader
    • getReaderCacheHelper

      public IndexReader.CacheHelper getReaderCacheHelper()
      Description copied from class: IndexReader
      Optional method: Return an IndexReader.CacheHelper that can be used to cache based on the content of this reader. Two readers that have different data or different sets of deleted documents will be considered different.

      A return value of null indicates that this reader is not suited for caching, which is typically the case for short-lived wrappers that alter the content of the wrapped reader.

      Specified by:
      getReaderCacheHelper in class IndexReader
    • getTermVectors

      public Fields getTermVectors(int docID) throws IOException
      Description copied from class: IndexReader
      Retrieve term vectors for this document, or null if term vectors were not indexed. The returned Fields instance acts like a single-document inverted index (the docID will be 0).
      Specified by:
      getTermVectors in class IndexReader
      Throws:
      IOException
    • doClose

      protected void doClose() throws IOException
      Description copied from class: IndexReader
      Implements close.
      Specified by:
      doClose in class IndexReader
      Throws:
      IOException
    • getNumericDocValues

      public NumericDocValues getNumericDocValues(String field) throws IOException
      Description copied from class: LeafReader
      Returns NumericDocValues for this field, or null if no numeric doc values were indexed for this field. The returned instance should only be used by a single thread.
      Specified by:
      getNumericDocValues in class LeafReader
      Throws:
      IOException
    • getBinaryDocValues

      public BinaryDocValues getBinaryDocValues(String field) throws IOException
      Description copied from class: LeafReader
      Returns BinaryDocValues for this field, or null if no binary doc values were indexed for this field. The returned instance should only be used by a single thread.
      Specified by:
      getBinaryDocValues in class LeafReader
      Throws:
      IOException
    • getSortedDocValues

      public SortedDocValues getSortedDocValues(String field) throws IOException
      Description copied from class: LeafReader
      Returns SortedDocValues for this field, or null if no SortedDocValues were indexed for this field. The returned instance should only be used by a single thread.
      Specified by:
      getSortedDocValues in class LeafReader
      Throws:
      IOException
    • getSortedNumericDocValues

      public SortedNumericDocValues getSortedNumericDocValues(String field) throws IOException
      Description copied from class: LeafReader
      Returns SortedNumericDocValues for this field, or null if no SortedNumericDocValues were indexed for this field. The returned instance should only be used by a single thread.
      Specified by:
      getSortedNumericDocValues in class LeafReader
      Throws:
      IOException
    • getSortedSetDocValues

      public SortedSetDocValues getSortedSetDocValues(String field) throws IOException
      Description copied from class: LeafReader
      Returns SortedSetDocValues for this field, or null if no SortedSetDocValues were indexed for this field. The returned instance should only be used by a single thread.
      Specified by:
      getSortedSetDocValues in class LeafReader
      Throws:
      IOException
    • getNormValues

      public NumericDocValues getNormValues(String field) throws IOException
      Description copied from class: LeafReader
      Returns NumericDocValues representing norms for this field, or null if no NumericDocValues were indexed. The returned instance should only be used by a single thread.
      Specified by:
      getNormValues in class LeafReader
      Throws:
      IOException
    • getPointValues

      public PointValues getPointValues(String fieldName) throws IOException
      Description copied from class: LeafReader
      Returns the PointValues used for numeric or spatial searches for the given field, or null if there are no point fields.
      Specified by:
      getPointValues in class LeafReader
      Throws:
      IOException
    • getVectorValues

      public VectorValues getVectorValues(String fieldName) throws IOException
      Description copied from class: LeafReader
      Returns VectorValues for this field, or null if no VectorValues were indexed. The returned instance should only be used by a single thread.
      Specified by:
      getVectorValues in class LeafReader
      Throws:
      IOException
    • searchNearestVectors

      public TopDocs searchNearestVectors(String fieldName, float[] target, int k, Bits acceptDocs, int visitedLimit) throws IOException
      Description copied from class: LeafReader
      Returns the k nearest neighbor documents to the given vector, determined by comparing their vector values for this field using the field's similarity function. The score of each document is derived from the vector similarity in a way that ensures scores are positive and that a larger score corresponds to a higher ranking.

      The search is allowed to be approximate, meaning the results are not guaranteed to be the true k closest neighbors. For large values of k (for example when k is close to the total number of documents), the search may also retrieve fewer than k documents.

      The returned TopDocs will contain a ScoreDoc for each nearest neighbor, sorted in order of their similarity to the query vector (decreasing scores). The TotalHits contains the number of documents visited during the search. If the search stopped early because it hit visitedLimit, it is indicated through the relation TotalHits.Relation.GREATER_THAN_OR_EQUAL_TO.

      Specified by:
      searchNearestVectors in class LeafReader
      Parameters:
      fieldName - the vector field to search
      target - the vector-valued query
      k - the number of docs to return
      acceptDocs - Bits that represents the allowed documents to match, or null if they are all allowed to match.
      visitedLimit - the maximum number of nodes that the search is allowed to visit
      Returns:
      the k nearest neighbor documents, along with their (searchStrategy-specific) scores.
      Throws:
      IOException
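
      The contract above (acceptDocs filtering, visitedLimit, positive scores in decreasing order) can be illustrated with an exact brute-force search. This is a toy sketch, not how Lucene searches: the real search may be approximate, and every name here (NearestVectorsSketch, Hit, Result, hitLimit) is hypothetical. java.util.BitSet stands in for Bits, and the score 1 / (1 + squared distance) is just one assumed way to keep scores positive with larger meaning closer.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;

// Toy model only: exact brute-force search over an in-memory array of vectors.
final class NearestVectorsSketch {

  static final class Hit {
    final int doc;
    final float score;

    Hit(int doc, float score) {
      this.doc = doc;
      this.score = score;
    }
  }

  static final class Result {
    final List<Hit> hits;   // at most k hits, by decreasing score
    final int visited;      // number of vectors actually compared
    final boolean hitLimit; // true if the search stopped early at visitedLimit

    Result(List<Hit> hits, int visited, boolean hitLimit) {
      this.hits = hits;
      this.visited = visited;
      this.hitLimit = hitLimit;
    }
  }

  static Result search(float[][] vectors, float[] target, int k, BitSet acceptDocs, int visitedLimit) {
    List<Hit> hits = new ArrayList<>();
    int visited = 0;
    boolean hitLimit = false;
    for (int doc = 0; doc < vectors.length; doc++) {
      if (acceptDocs != null && !acceptDocs.get(doc)) {
        continue; // null acceptDocs means every document may match
      }
      if (visited >= visitedLimit) {
        hitLimit = true; // an allowed document remains unvisited
        break;
      }
      visited++;
      float squaredDistance = 0f;
      for (int i = 0; i < target.length; i++) {
        float diff = vectors[doc][i] - target[i];
        squaredDistance += diff * diff;
      }
      hits.add(new Hit(doc, 1f / (1f + squaredDistance)));
    }
    // Stable sort by decreasing score; ties keep doc ID order.
    hits.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
    return new Result(new ArrayList<>(hits.subList(0, Math.min(k, hits.size()))), visited, hitLimit);
  }
}
```

      In this model the visited count and the hitLimit flag play the role that TotalHits and its relation play in the returned TopDocs.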
    • checkIntegrity

      public void checkIntegrity() throws IOException
      Description copied from class: LeafReader
      Checks consistency of this reader.

      Note that this may be costly in terms of I/O, e.g. may involve computing a checksum value against large data files.

      Specified by:
      checkIntegrity in class LeafReader
      Throws:
      IOException
    • getParallelReaders

      public LeafReader[] getParallelReaders()
      Returns the LeafReaders that were passed to the constructor.
    • getMetaData

      public LeafMetaData getMetaData()
      Description copied from class: LeafReader
      Return metadata about this leaf.
      Specified by:
      getMetaData in class LeafReader