Class LeafReader

  • All Implemented Interfaces:
    Closeable, AutoCloseable
    Direct Known Subclasses:
    CodecReader, FilterLeafReader, ParallelLeafReader

    public abstract class LeafReader
    extends IndexReader
    LeafReader is an abstract class that provides an interface for accessing an index. Search of an index is done entirely through this abstract interface, so that any subclass which implements it is searchable. IndexReaders implemented by this subclass are atomic: they do not consist of several sub-readers. They support retrieval of stored fields, doc values, terms, and postings.

    For efficiency, in this API documents are often referred to via document numbers, non-negative integers which each name a unique document in the index. These document numbers are ephemeral -- they may change as documents are added to and deleted from an index. Clients should thus not rely on a given document having the same number between sessions.

    NOTE: IndexReader instances are completely thread safe, meaning multiple threads can call any of their methods concurrently. If your application requires external synchronization, you should not synchronize on the IndexReader instance; use your own (non-Lucene) objects instead.
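
    The synchronization advice above can be sketched with the JDK alone. This is a minimal illustration, not Lucene code: SharedSearchState and its fields are hypothetical, and the Object-typed reader field merely stands in for an IndexReader. The point is only that the monitor is a private object owned by the application.

```java
// Hypothetical holder for application state that must change together with the reader.
final class SharedSearchState {
  // Private lock owned by the application; the reader itself is never used as a monitor.
  private final Object lock = new Object();
  private Object reader;    // stand-in for an IndexReader (assumption for illustration)
  private long generation;  // application-level counter guarded by 'lock'

  SharedSearchState(Object initialReader) {
    this.reader = initialReader;
  }

  // Swaps in a new reader and bumps the generation as one atomic step.
  long swap(Object newReader) {
    synchronized (lock) {
      reader = newReader;
      return ++generation;
    }
  }

  // Returns the reader most recently installed.
  Object current() {
    synchronized (lock) {
      return reader;
    }
  }

  public static void main(String[] args) {
    SharedSearchState state = new SharedSearchState("reader-1");
    System.out.println(state.swap("reader-2") + " " + state.current());
  }
}
```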

    • Constructor Detail

      • LeafReader

        protected LeafReader()
        Sole constructor. (For invocation by subclass constructors, typically implicit.)
    • Method Detail

      • getContext

        public final LeafReaderContext getContext()
        Description copied from class: IndexReader
        Expert: Returns the root IndexReaderContext for this IndexReader's sub-reader tree.

        Iff this reader is composed of sub-readers, i.e. if this reader is a composite reader, this method returns a CompositeReaderContext holding the reader's direct children as well as a view of the reader tree's atomic leaf contexts. All sub-IndexReaderContext instances referenced from this reader's top-level context are private to this reader and are not shared with another context tree. For example, IndexSearcher uses this API to drive searching by one atomic leaf reader at a time. If this reader is not composed of child readers, this method returns a LeafReaderContext.

        Note: None of the sub-CompositeReaderContext instances referenced from this top-level context support CompositeReaderContext.leaves(). Only the top-level context maintains the convenience leaf-view, for performance reasons.

        Specified by:
        getContext in class IndexReader
      • getCoreCacheHelper

        public abstract IndexReader.CacheHelper getCoreCacheHelper()
        Optional method: Return an IndexReader.CacheHelper that can be used to cache based on the content of this leaf regardless of deletions. Two readers that have the same data but different sets of deleted documents or doc values updates may be considered equal. Consider using IndexReader.getReaderCacheHelper() if you need deletions or doc values updates to be taken into account.

        A return value of null indicates that this reader is not suited for caching, which is typically the case for short-lived wrappers that alter the content of the wrapped leaf reader.

        WARNING: This API is experimental and might change in incompatible ways in the next release.
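
        The caching contract described above can be modeled as a map keyed on an identity object for the leaf's core data, with a null helper meaning "do not cache". The sketch below uses only the JDK; CoreKeyedCache and its nested Helper interface are hypothetical stand-ins, not Lucene types.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Toy per-core cache; 'CoreKeyedCache' and 'Helper' are hypothetical, not Lucene types.
final class CoreKeyedCache<V> {
  // Stand-in for a cache helper: exposes an identity key for the leaf's core data.
  interface Helper {
    Object key();
  }

  private final Map<Object, V> cache = new ConcurrentHashMap<>();

  // Returns the value cached under the helper's key, computing it on first use.
  // A null helper means the reader is not suited for caching: compute and store nothing.
  V get(Helper helper, Supplier<V> loader) {
    if (helper == null) {
      return loader.get();
    }
    return cache.computeIfAbsent(helper.key(), k -> loader.get());
  }

  // Number of distinct core keys currently cached.
  int size() {
    return cache.size();
  }

  public static void main(String[] args) {
    CoreKeyedCache<String> cache = new CoreKeyedCache<>();
    Object coreKey = new Object();
    System.out.println(cache.get(() -> coreKey, () -> "expensive value"));
    System.out.println(cache.size());
  }
}
```

        Two readers that share a core key hit the same entry even if their deletions differ, which is why values cached this way must not depend on live docs. A real cache would also need to evict entries once the underlying core is closed; that is left out here.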
      • docFreq

        public final int docFreq(Term term)
                          throws IOException
        Description copied from class: IndexReader
        Returns the number of documents containing the term. This method returns 0 if the term or field does not exist. This method does not take into account deleted documents that have not yet been merged away.
        Specified by:
        docFreq in class IndexReader
        Throws:
        IOException
        See Also:
        TermsEnum.docFreq()
      • totalTermFreq

        public final long totalTermFreq(Term term)
                                 throws IOException
        Returns the total number of occurrences of the term across all documents (the sum of the term's frequency in each document that contains it). This method returns 0 if the term or field does not exist. This method does not take into account deleted documents that have not yet been merged away.
        Specified by:
        totalTermFreq in class IndexReader
        Throws:
        IOException
      • getDocCount

        public final int getDocCount(String field)
                              throws IOException
        Description copied from class: IndexReader
        Returns the number of documents that have at least one term for this field. Note that, just like other term measures, this measure does not take deleted documents into account.
        Specified by:
        getDocCount in class IndexReader
        Throws:
        IOException
        See Also:
        Terms.getDocCount()
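
        The three statistics above (docFreq, totalTermFreq, getDocCount) are easy to confuse, so here is a small JDK-only model of what each one counts for a single field. TermStatsSketch is a hypothetical toy over pre-tokenized documents, not Lucene code, and it ignores deletions entirely.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of per-field term statistics over tokenized documents (not Lucene code).
final class TermStatsSketch {
  // Number of documents that contain the term at least once.
  static int docFreq(List<String[]> docs, String term) {
    int count = 0;
    for (String[] doc : docs) {
      if (Arrays.asList(doc).contains(term)) {
        count++;
      }
    }
    return count;
  }

  // Total number of occurrences of the term, summed over all documents.
  static long totalTermFreq(List<String[]> docs, String term) {
    long total = 0;
    for (String[] doc : docs) {
      for (String token : doc) {
        if (token.equals(term)) {
          total++;
        }
      }
    }
    return total;
  }

  // Number of documents that have at least one term for the field.
  static int docCount(List<String[]> docs) {
    int count = 0;
    for (String[] doc : docs) {
      if (doc.length > 0) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    List<String[]> docs = Arrays.asList(
        new String[] {"a", "b", "a"}, new String[] {"b"}, new String[0]);
    System.out.println(docFreq(docs, "a") + " " + totalTermFreq(docs, "a") + " " + docCount(docs));
  }
}
```

        In this model a term that appears twice in one document contributes 1 to docFreq but 2 to totalTermFreq, and a document with no tokens for the field does not count toward docCount.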
      • getBinaryDocValues

        public abstract BinaryDocValues getBinaryDocValues(String field)
                                                    throws IOException
        Returns BinaryDocValues for this field, or null if no binary doc values were indexed for this field. The returned instance should only be used by a single thread.
        Throws:
        IOException
      • getVectorValues

        public abstract VectorValues getVectorValues(String field)
                                              throws IOException
        Returns VectorValues for this field, or null if no VectorValues were indexed. The returned instance should only be used by a single thread.
        Throws:
        IOException
        WARNING: This API is experimental and might change in incompatible ways in the next release.
      • searchNearestVectors

        public abstract TopDocs searchNearestVectors(String field,
                                                     float[] target,
                                                     int k,
                                                     Bits acceptDocs,
                                                     int visitedLimit)
                                              throws IOException
        Returns the k nearest neighbor documents, determined by comparing their vector values for this field to the given vector using the field's similarity function. The score of each document is derived from the vector similarity in a way that ensures scores are positive and that a larger score corresponds to a higher ranking.

        The search is allowed to be approximate, meaning the results are not guaranteed to be the true k closest neighbors. For large values of k (for example when k is close to the total number of documents), the search may also retrieve fewer than k documents.

        The returned TopDocs will contain a ScoreDoc for each nearest neighbor, sorted in order of their similarity to the query vector (decreasing scores). The TotalHits contains the number of documents visited during the search. If the search stopped early because it hit visitedLimit, it is indicated through the relation TotalHits.Relation.GREATER_THAN_OR_EQUAL_TO.

        Parameters:
        field - the vector field to search
        target - the vector-valued query
        k - the number of docs to return
        acceptDocs - Bits that represent the documents allowed to match, or null if all documents are allowed to match.
        visitedLimit - the maximum number of nodes that the search is allowed to visit
        Returns:
        the k nearest neighbor documents, along with their (similarity-function-specific) scores.
        Throws:
        IOException
        WARNING: This API is experimental and might change in incompatible ways in the next release.
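
        To make the roles of k, acceptDocs and visitedLimit concrete, the following JDK-only sketch models the contract with an exact brute-force scan. It is not how an implementation is expected to search (the real search may be approximate), and every name in it is hypothetical: KnnSketch, Hit and Result are illustrative types, java.util.BitSet stands in for Bits, and the 1 / (1 + squared distance) score is just one assumed way to get positive scores where larger means closer.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;

// Brute-force model of a filtered k-nearest-neighbor scan; all names are hypothetical.
final class KnnSketch {
  static final class Hit {
    final int doc;
    final float score;

    Hit(int doc, float score) {
      this.doc = doc;
      this.score = score;
    }
  }

  static final class Result {
    final List<Hit> hits;        // best score first, at most k entries
    final int visited;           // number of vectors actually compared
    final boolean stoppedEarly;  // true if visitedLimit cut the scan short

    Result(List<Hit> hits, int visited, boolean stoppedEarly) {
      this.hits = hits;
      this.visited = visited;
      this.stoppedEarly = stoppedEarly;
    }
  }

  static Result search(float[][] vectors, float[] target, int k, BitSet acceptDocs, int visitedLimit) {
    List<Hit> candidates = new ArrayList<>();
    int visited = 0;
    boolean stoppedEarly = false;
    for (int doc = 0; doc < vectors.length; doc++) {
      if (acceptDocs != null && !acceptDocs.get(doc)) {
        continue; // null accepts every document; otherwise only set bits may match
      }
      if (visited >= visitedLimit) {
        stoppedEarly = true; // more candidates remain, but the budget is spent
        break;
      }
      visited++;
      candidates.add(new Hit(doc, score(vectors[doc], target)));
    }
    // Sort by decreasing score and keep the best k.
    candidates.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
    List<Hit> top = new ArrayList<>(candidates.subList(0, Math.min(k, candidates.size())));
    return new Result(top, visited, stoppedEarly);
  }

  // Maps squared Euclidean distance to a positive score where larger means closer.
  static float score(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      float d = a[i] - b[i];
      sum += d * d;
    }
    return 1f / (1f + sum);
  }

  public static void main(String[] args) {
    float[][] vectors = {{0f, 0f}, {1f, 0f}, {5f, 5f}, {0.5f, 0f}};
    Result r = search(vectors, new float[] {0f, 0f}, 2, null, 10);
    for (Hit h : r.hits) {
      System.out.println(h.doc + " " + h.score);
    }
  }
}
```

        In terms of the description above, visited plays the role of the TotalHits value, and stoppedEarly corresponds to the GREATER_THAN_OR_EQUAL_TO relation.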
      • getFieldInfos

        public abstract FieldInfos getFieldInfos()
        Get the FieldInfos describing all fields in this reader.

        Note: Implementations should cache the FieldInfos instance returned by this method such that subsequent calls to this method return the same instance.

        WARNING: This API is experimental and might change in incompatible ways in the next release.
      • getLiveDocs

        public abstract Bits getLiveDocs()
        Returns the Bits representing live (not deleted) docs. A set bit indicates the doc ID has not been deleted. If this method returns null it means there are no deleted documents (all documents are live).

        The returned instance has been safely published for use by multiple threads without additional synchronization.
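
        The null convention above is a common source of NullPointerExceptions, so here is a minimal JDK-only sketch of how a caller might handle it. LiveDocsSketch is hypothetical, java.util.BitSet stands in for Bits, and maxDoc is assumed to be one greater than the largest doc ID.

```java
import java.util.BitSet;

// Models the live-docs convention with java.util.BitSet standing in for Bits.
final class LiveDocsSketch {
  // A null bit set means there are no deletions, so every document is live.
  static boolean isLive(BitSet liveDocs, int doc) {
    return liveDocs == null || liveDocs.get(doc);
  }

  // Counts live documents among doc IDs 0..maxDoc-1.
  static int countLive(BitSet liveDocs, int maxDoc) {
    int live = 0;
    for (int doc = 0; doc < maxDoc; doc++) {
      if (isLive(liveDocs, doc)) {
        live++;
      }
    }
    return live;
  }

  public static void main(String[] args) {
    BitSet liveDocs = new BitSet(5);
    liveDocs.set(0, 5);
    liveDocs.clear(2); // doc 2 is deleted
    System.out.println(countLive(liveDocs, 5) + " " + countLive(null, 5));
  }
}
```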

      • checkIntegrity

        public abstract void checkIntegrity()
                                     throws IOException
        Checks consistency of this reader.

        Note that this may be costly in terms of I/O, e.g. may involve computing a checksum value against large data files.

        Throws:
        IOException
        NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
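
        The I/O cost mentioned above comes from having to read every byte that a checksum covers. The sketch below illustrates the idea with the JDK's CRC32 over an in-memory byte array; ChecksumSketch is hypothetical, and the choice of CRC32 and the way the expected value is supplied are assumptions made for illustration, not a description of what any particular reader does.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Illustrative checksum verification; the algorithm and layout are assumptions.
final class ChecksumSketch {
  // Computes a CRC32 over the whole array, touching every byte once.
  static long checksum(byte[] data) {
    CRC32 crc = new CRC32();
    crc.update(data, 0, data.length);
    return crc.getValue();
  }

  // Returns true if the data still matches the checksum recorded earlier.
  static boolean verify(byte[] data, long expected) {
    return checksum(data) == expected;
  }

  public static void main(String[] args) {
    byte[] data = "segment contents".getBytes(StandardCharsets.UTF_8);
    long recorded = checksum(data);
    data[0] ^= 1; // simulate corruption of a single bit
    System.out.println(verify(data, recorded));
  }
}
```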
      • getMetaData

        public abstract LeafMetaData getMetaData()
        Return metadata about this leaf.
        WARNING: This API is experimental and might change in incompatible ways in the next release.