Class LeafReader
- All Implemented Interfaces:
Closeable, AutoCloseable
- Direct Known Subclasses:
CodecReader, FilterLeafReader, ParallelLeafReader
LeafReader is an abstract class that provides an interface for accessing an index. Search of
an index is done entirely through this abstract interface, so that any subclass which implements
it is searchable. IndexReaders implemented through this subclass are atomic: they do not consist
of several sub-readers. They support retrieval of stored fields, doc values, terms, and
postings.
For efficiency, this API often refers to documents by document number: a non-negative integer that names a unique document in the index. Document numbers are ephemeral; they may change as documents are added to and deleted from an index. Clients should thus not rely on a given document having the same number between sessions.
NOTE: IndexReader instances are completely thread safe, meaning multiple
threads can call any of their methods concurrently. If your application requires external
synchronization, you should not synchronize on the IndexReader instance; use
your own (non-Lucene) objects instead.
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.index.IndexReader
IndexReader.CacheHelper, IndexReader.CacheKey, IndexReader.ClosedListener
-
Constructor Summary
Modifier | Constructor | Description
protected | LeafReader() | Sole constructor.
-
Method Summary
Modifier and Type | Method | Description
abstract void | checkIntegrity() | Checks consistency of this reader.
final int | docFreq(Term term) | Returns the number of documents containing the term.
abstract BinaryDocValues | getBinaryDocValues(String field) | Returns BinaryDocValues for this field, or null if no binary doc values were indexed for this field.
abstract ByteVectorValues | getByteVectorValues(String field) | Returns ByteVectorValues for this field, or null if no ByteVectorValues were indexed.
final LeafReaderContext | getContext() | Expert: Returns the root IndexReaderContext for this IndexReader's sub-reader tree.
abstract IndexReader.CacheHelper | getCoreCacheHelper() | Optional method: Returns an IndexReader.CacheHelper that can be used to cache based on the content of this leaf regardless of deletions.
final int | getDocCount(String field) | Returns the number of documents that have at least one term for this field.
abstract DocValuesSkipper | getDocValuesSkipper(String field) | Returns a DocValuesSkipper allowing skipping ranges of doc IDs that are not of interest, or null if a skip index was not indexed.
abstract FieldInfos | getFieldInfos() | Gets the FieldInfos describing all fields in this reader.
abstract FloatVectorValues | getFloatVectorValues(String field) | Returns FloatVectorValues for this field, or null if no FloatVectorValues were indexed.
abstract Bits | getLiveDocs() | Returns the Bits representing live (not deleted) docs.
abstract LeafMetaData | getMetaData() | Returns metadata about this leaf.
abstract NumericDocValues | getNormValues(String field) | Returns NumericDocValues representing norms for this field, or null if no NumericDocValues were indexed.
abstract NumericDocValues | getNumericDocValues(String field) | Returns NumericDocValues for this field, or null if no numeric doc values were indexed for this field.
abstract PointValues | getPointValues(String field) | Returns the PointValues used for numeric or spatial searches for the given field, or null if there are no point fields.
abstract SortedDocValues | getSortedDocValues(String field) | Returns SortedDocValues for this field, or null if no SortedDocValues were indexed for this field.
abstract SortedNumericDocValues | getSortedNumericDocValues(String field) | Returns SortedNumericDocValues for this field, or null if no SortedNumericDocValues were indexed for this field.
abstract SortedSetDocValues | getSortedSetDocValues(String field) | Returns SortedSetDocValues for this field, or null if no SortedSetDocValues were indexed for this field.
final long | getSumDocFreq(String field) | Returns the sum of TermsEnum.docFreq() for all terms in this field.
final long | getSumTotalTermFreq(String field) | Returns the sum of TermsEnum.totalTermFreq() for all terms in this field.
final PostingsEnum | postings(Term term) | Returns PostingsEnum for the specified term with PostingsEnum.FREQS.
final PostingsEnum | postings(Term term, int flags) | Returns PostingsEnum for the specified term.
final TopDocs | searchNearestVectors(String field, byte[] target, int k, AcceptDocs acceptDocs, int visitedLimit) | Returns the k nearest neighbor documents as determined by comparison of their vector values for this field, to the given vector, by the field's similarity function.
abstract void | searchNearestVectors(String field, byte[] target, KnnCollector knnCollector, AcceptDocs acceptDocs) | Collects the k nearest neighbor documents as determined by comparison of their vector values for this field, to the given vector, by the field's similarity function.
final TopDocs | searchNearestVectors(String field, float[] target, int k, AcceptDocs acceptDocs, int visitedLimit) | Returns the k nearest neighbor documents as determined by comparison of their vector values for this field, to the given vector, by the field's similarity function.
abstract void | searchNearestVectors(String field, float[] target, KnnCollector knnCollector, AcceptDocs acceptDocs) | Collects the k nearest neighbor documents as determined by comparison of their vector values for this field, to the given vector, by the field's similarity function.
abstract Terms | terms(String field) | Returns the Terms index for this field, or null if it has none.
final long | totalTermFreq(Term term) | Returns the total number of occurrences of the term across all documents.
Methods inherited from class org.apache.lucene.index.IndexReader
close, decRef, doClose, ensureOpen, equals, getReaderCacheHelper, getRefCount, hasDeletions, hashCode, incRef, leaves, maxDoc, notifyReaderClosedListeners, numDeletedDocs, numDocs, registerParentReader, storedFields, termVectors, tryIncRef
-
Constructor Details
-
LeafReader
protected LeafReader()
Sole constructor. (For invocation by subclass constructors, typically implicit.)
-
-
Method Details
-
getContext
Description copied from class: IndexReader
Expert: Returns the root IndexReaderContext for this IndexReader's sub-reader tree.
If and only if this reader is composed of sub-readers, i.e. this reader is a composite reader, this method returns a CompositeReaderContext holding the reader's direct children as well as a view of the reader tree's atomic leaf contexts. All sub-IndexReaderContext instances referenced from this reader's top-level context are private to this reader and are not shared with another context tree. For example, IndexSearcher uses this API to drive searching by one atomic leaf reader at a time. If this reader is not composed of child readers, this method returns a LeafReaderContext.
Note: Any of the sub-CompositeReaderContext instances referenced from this top-level context do not support CompositeReaderContext.leaves(). Only the top-level context maintains the convenience leaf-view for performance reasons.
- Specified by:
getContext in class IndexReader
-
getCoreCacheHelper
Optional method: Returns an IndexReader.CacheHelper that can be used to cache based on the content of this leaf regardless of deletions. Two readers that have the same data but different sets of deleted documents or doc values updates may be considered equal. Consider using IndexReader.getReaderCacheHelper() if you need deletions or doc values updates to be taken into account.
A return value of null indicates that this reader is not suited for caching, which is typically the case for short-lived wrappers that alter the content of the wrapped leaf reader.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
docFreq
Description copied from class: IndexReader
Returns the number of documents containing the term. This method returns 0 if the term or field does not exist. This method does not take into account deleted documents that have not yet been merged away.
- Specified by:
docFreq in class IndexReader
- Throws:
IOException
-
totalTermFreq
Returns the total number of occurrences of the term across all documents (the sum of the freq() for each doc that has this term). This method returns 0 if the term or field does not exist. This method does not take into account deleted documents that have not yet been merged away.
- Specified by:
totalTermFreq in class IndexReader
- Throws:
IOException
-
getSumDocFreq
Description copied from class: IndexReader
Returns the sum of TermsEnum.docFreq() for all terms in this field. Note that, just like other term measures, this measure does not take deleted documents into account.
- Specified by:
getSumDocFreq in class IndexReader
- Throws:
IOException
-
getDocCount
Description copied from class: IndexReader
Returns the number of documents that have at least one term for this field. Note that, just like other term measures, this measure does not take deleted documents into account.
- Specified by:
getDocCount in class IndexReader
- Throws:
IOException
-
getSumTotalTermFreq
Description copied from class: IndexReader
Returns the sum of TermsEnum.totalTermFreq() for all terms in this field. Note that, just like other term measures, this measure does not take deleted documents into account.
- Specified by:
getSumTotalTermFreq in class IndexReader
- Throws:
IOException
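The five term statistics above (docFreq, totalTermFreq, getSumDocFreq, getDocCount, getSumTotalTermFreq) are easy to confuse. The following is a minimal sketch that computes each one over a tiny in-memory field. It is a toy model in plain Java, not Lucene code: the class name TermStatsSketch, the DOCS list, and the String-based method signatures are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.List;

/** Toy model of per-field term statistics. Hypothetical; this is not Lucene code. */
class TermStatsSketch {
    // Each inner list is the token stream of one document for a single field.
    static final List<List<String>> DOCS = List.of(
        List.of("lucene", "index", "lucene"),
        List.of("index", "reader"),
        List.<String>of()); // a document with no terms in this field

    /** Number of documents that contain the term at least once. */
    static int docFreq(String term) {
        int n = 0;
        for (List<String> doc : DOCS) {
            if (doc.contains(term)) {
                n++;
            }
        }
        return n;
    }

    /** Total number of occurrences of the term across all documents. */
    static long totalTermFreq(String term) {
        long n = 0;
        for (List<String> doc : DOCS) {
            for (String t : doc) {
                if (t.equals(term)) {
                    n++;
                }
            }
        }
        return n;
    }

    /** Sum of docFreq over all distinct terms: distinct terms per document, added up. */
    static long sumDocFreq() {
        long n = 0;
        for (List<String> doc : DOCS) {
            n += new HashSet<>(doc).size();
        }
        return n;
    }

    /** Sum of totalTermFreq over all distinct terms: the total token count. */
    static long sumTotalTermFreq() {
        long n = 0;
        for (List<String> doc : DOCS) {
            n += doc.size();
        }
        return n;
    }

    /** Number of documents that have at least one term in the field. */
    static int docCount() {
        int n = 0;
        for (List<String> doc : DOCS) {
            if (!doc.isEmpty()) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println("docFreq(lucene)       = " + docFreq("lucene"));
        System.out.println("totalTermFreq(lucene) = " + totalTermFreq("lucene"));
        System.out.println("sumDocFreq            = " + sumDocFreq());
        System.out.println("sumTotalTermFreq      = " + sumTotalTermFreq());
        System.out.println("docCount              = " + docCount());
    }
}
```

In this toy field, "lucene" appears twice in a single document, so its document frequency and total term frequency differ. The sketch has no notion of deletions; in a real index these measures still count deleted documents that have not been merged away.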
-
terms
Returns the Terms index for this field, or null if it has none.
- Throws:
IOException
-
postings
Returns PostingsEnum for the specified term. This will return null if either the field or term does not exist.
NOTE: The returned PostingsEnum may contain deleted docs.
- Throws:
IOException
-
postings
Returns PostingsEnum for the specified term with PostingsEnum.FREQS.
Use this method if you only require documents and frequencies, and do not need any proximity data. This method is equivalent to postings(term, PostingsEnum.FREQS).
NOTE: The returned PostingsEnum may contain deleted docs.
- Throws:
IOException
-
getNumericDocValues
Returns NumericDocValues for this field, or null if no numeric doc values were indexed for this field. The returned instance should only be used by a single thread.
- Throws:
IOException
-
getBinaryDocValues
Returns BinaryDocValues for this field, or null if no binary doc values were indexed for this field. The returned instance should only be used by a single thread.
- Throws:
IOException
-
getSortedDocValues
Returns SortedDocValues for this field, or null if no SortedDocValues were indexed for this field. The returned instance should only be used by a single thread.
- Throws:
IOException
-
getSortedNumericDocValues
Returns SortedNumericDocValues for this field, or null if no SortedNumericDocValues were indexed for this field. The returned instance should only be used by a single thread.
- Throws:
IOException
-
getSortedSetDocValues
Returns SortedSetDocValues for this field, or null if no SortedSetDocValues were indexed for this field. The returned instance should only be used by a single thread.
- Throws:
IOException
-
getNormValues
Returns NumericDocValues representing norms for this field, or null if no NumericDocValues were indexed. The returned instance should only be used by a single thread.
- Throws:
IOException
-
getDocValuesSkipper
Returns a DocValuesSkipper allowing skipping ranges of doc IDs that are not of interest, or null if a skip index was not indexed. The returned instance should be confined to the thread that created it.
- Throws:
IOException
-
getFloatVectorValues
Returns FloatVectorValues for this field, or null if no FloatVectorValues were indexed. The returned instance should only be used by a single thread.
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
getByteVectorValues
Returns ByteVectorValues for this field, or null if no ByteVectorValues were indexed. The returned instance should only be used by a single thread.
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
searchNearestVectors
public final TopDocs searchNearestVectors(String field, float[] target, int k, AcceptDocs acceptDocs, int visitedLimit) throws IOException
Return the k nearest neighbor documents as determined by comparison of their vector values for this field, to the given vector, by the field's similarity function. The score of each document is derived from the vector similarity in a way that ensures scores are positive and that a larger score corresponds to a higher ranking.
The search is allowed to be approximate, meaning the results are not guaranteed to be the true k closest neighbors. For large values of k (for example when k is close to the total number of documents), the search may also retrieve fewer than k documents.
The returned TopDocs will contain a ScoreDoc for each nearest neighbor, sorted in order of their similarity to the query vector (decreasing scores). The TotalHits contains the number of documents visited during the search. If the search stopped early because it hit visitedLimit, it is indicated through the relation TotalHits.Relation.GREATER_THAN_OR_EQUAL_TO.
- Parameters:
field - the vector field to search
target - the vector-valued query
k - the number of docs to return
acceptDocs - AcceptDocs that represents the allowed documents to match
visitedLimit - the maximum number of nodes that the search is allowed to visit
- Returns:
the k nearest neighbor documents, along with their (searchStrategy-specific) scores.
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
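To make the result contract concrete (positive scores, decreasing order, possibly fewer than k hits, and an early-stop signal when the visit limit is reached), here is a minimal sketch of an exhaustive nearest-neighbor scan. It is a toy model in plain Java, not Lucene code: the class KnnSketch, its Hit and Result types, and the choice of 1 / (1 + squared Euclidean distance) as the similarity-to-score mapping are assumptions for illustration. A real reader typically searches an approximate index structure rather than scanning every vector.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy exhaustive nearest-neighbor search. Hypothetical; this is not Lucene code. */
class KnnSketch {
    /** One result: a document number and a positive score (larger ranks higher). */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /** Hits in decreasing score order, plus how many vectors were visited. */
    static final class Result {
        final List<Hit> hits;
        final int visited;
        final boolean stoppedEarly; // true when the visit limit cut the scan short

        Result(List<Hit> hits, int visited, boolean stoppedEarly) {
            this.hits = hits;
            this.visited = visited;
            this.stoppedEarly = stoppedEarly;
        }
    }

    /** Assumed score mapping: 1 / (1 + squared Euclidean distance), always in (0, 1]. */
    static float score(float[] a, float[] b) {
        float d2 = 0f;
        for (int i = 0; i < a.length; i++) {
            float diff = a[i] - b[i];
            d2 += diff * diff;
        }
        return 1f / (1f + d2);
    }

    /** Scans vectors in doc order, visiting at most visitedLimit of them, and keeps the top k. */
    static Result search(float[][] vectors, float[] target, int k, int visitedLimit) {
        List<Hit> hits = new ArrayList<>();
        int visited = 0;
        boolean stoppedEarly = false;
        for (int doc = 0; doc < vectors.length; doc++) {
            if (visited >= visitedLimit) {
                stoppedEarly = true; // unvisited vectors remain
                break;
            }
            hits.add(new Hit(doc, score(vectors[doc], target)));
            visited++;
        }
        // Sort by decreasing score, then truncate to at most k hits.
        hits.sort((x, y) -> Float.compare(y.score, x.score));
        List<Hit> top = new ArrayList<>(hits.subList(0, Math.min(k, hits.size())));
        return new Result(top, visited, stoppedEarly);
    }

    public static void main(String[] args) {
        float[][] vectors = {{0f, 0f}, {1f, 0f}, {3f, 4f}, {0f, 2f}};
        Result r = search(vectors, new float[] {0f, 0f}, 2, 10);
        for (Hit h : r.hits) {
            System.out.println("doc=" + h.doc + " score=" + h.score);
        }
        System.out.println("visited=" + r.visited + " stoppedEarly=" + r.stoppedEarly);
    }
}
```

The stoppedEarly flag plays the role that the GREATER_THAN_OR_EQUAL_TO relation plays in the description above: it tells the caller that the visited count is a lower bound and that better neighbors may exist among the unvisited vectors.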
-
searchNearestVectors
public final TopDocs searchNearestVectors(String field, byte[] target, int k, AcceptDocs acceptDocs, int visitedLimit) throws IOException
Return the k nearest neighbor documents as determined by comparison of their vector values for this field, to the given vector, by the field's similarity function. The score of each document is derived from the vector similarity in a way that ensures scores are positive and that a larger score corresponds to a higher ranking.
The search is allowed to be approximate, meaning the results are not guaranteed to be the true k closest neighbors. For large values of k (for example when k is close to the total number of documents), the search may also retrieve fewer than k documents.
The returned TopDocs will contain a ScoreDoc for each nearest neighbor, sorted in order of their similarity to the query vector (decreasing scores). The TotalHits contains the number of documents visited during the search. If the search stopped early because it hit visitedLimit, it is indicated through the relation TotalHits.Relation.GREATER_THAN_OR_EQUAL_TO.
- Parameters:
field - the vector field to search
target - the vector-valued query
k - the number of docs to return
acceptDocs - AcceptDocs that represents the allowed documents to match
visitedLimit - the maximum number of nodes that the search is allowed to visit
- Returns:
the k nearest neighbor documents, along with their (searchStrategy-specific) scores.
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
searchNearestVectors
public abstract void searchNearestVectors(String field, float[] target, KnnCollector knnCollector, AcceptDocs acceptDocs) throws IOException
Collect the k nearest neighbor documents as determined by comparison of their vector values for this field, to the given vector, by the field's similarity function. The score of each document is derived from the vector similarity in a way that ensures scores are positive and that a larger score corresponds to a higher ranking.
The search is allowed to be approximate, meaning the results are not guaranteed to be the true k closest neighbors. For large values of k (for example when k is close to the total number of documents), the search may also retrieve fewer than k documents.
This method returns nothing; results are gathered by the given knnCollector. The TopDocs obtained from the collector will contain a ScoreDoc for each nearest neighbor, in order of their similarity to the query vector (decreasing scores). The TotalHits contains the number of documents visited during the search. If the search stopped early because it hit the collector's visit limit, it is indicated through the relation TotalHits.Relation.GREATER_THAN_OR_EQUAL_TO.
The behavior is undefined if the given field doesn't have KNN vectors enabled on its FieldInfo.
- Parameters:
field - the vector field to search
target - the vector-valued query
knnCollector - collector with settings for gathering the vector results.
acceptDocs - AcceptDocs that represents the allowed documents to match
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
searchNearestVectors
public abstract void searchNearestVectors(String field, byte[] target, KnnCollector knnCollector, AcceptDocs acceptDocs) throws IOException
Collect the k nearest neighbor documents as determined by comparison of their vector values for this field, to the given vector, by the field's similarity function. The score of each document is derived from the vector similarity in a way that ensures scores are positive and that a larger score corresponds to a higher ranking.
The search is allowed to be approximate, meaning the results are not guaranteed to be the true k closest neighbors. For large values of k (for example when k is close to the total number of documents), the search may also retrieve fewer than k documents.
This method returns nothing; results are gathered by the given knnCollector. The TopDocs obtained from the collector will contain a ScoreDoc for each nearest neighbor, in order of their similarity to the query vector (decreasing scores). The TotalHits contains the number of documents visited during the search. If the search stopped early because it hit the collector's visit limit, it is indicated through the relation TotalHits.Relation.GREATER_THAN_OR_EQUAL_TO.
The behavior is undefined if the given field doesn't have KNN vectors enabled on its FieldInfo.
- Parameters:
field - the vector field to search
target - the vector-valued query
knnCollector - collector with settings for gathering the vector results.
acceptDocs - AcceptDocs that represents the allowed documents to match
- Throws:
IOException
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
getFieldInfos
Get the FieldInfos describing all fields in this reader.
Note: Implementations should cache the FieldInfos instance returned by this method such that subsequent calls to this method return the same instance.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
getLiveDocs
Returns the Bits representing live (not deleted) docs. A set bit indicates the doc ID has not been deleted. If this method returns null it means there are no deleted documents (all documents are live).
The returned instance has been safely published for use by multiple threads without additional synchronization.
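The null-means-all-live convention is easy to get wrong at call sites, so a minimal sketch of the check follows. It is a toy model in plain Java, not Lucene code: the LiveBits interface is a hypothetical stand-in for a read-only bit view, and countLive with its maxDoc parameter is an assumed helper for illustration.

```java
import java.util.BitSet;

/** Toy model of the "null means no deletions" convention. Hypothetical; this is not Lucene code. */
class LiveDocsSketch {
    /** Hypothetical read-only bit view; a set bit means the document is live. */
    interface LiveBits {
        boolean get(int index);
    }

    /** Counts live documents in [0, maxDoc), treating a null view as "all documents are live". */
    static int countLive(LiveBits liveDocs, int maxDoc) {
        int live = 0;
        for (int doc = 0; doc < maxDoc; doc++) {
            // The null check must come first: null means nothing has been deleted.
            if (liveDocs == null || liveDocs.get(doc)) {
                live++;
            }
        }
        return live;
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet(5);
        bits.set(0, 5);  // mark docs 0..4 as live
        bits.clear(2);   // doc 2 has been deleted
        System.out.println("with deletions: " + countLive(bits::get, 5));
        System.out.println("no deletions:   " + countLive(null, 5));
    }
}
```

Callers that iterate postings or doc values, which may include deleted documents, would apply the same `liveDocs == null || liveDocs.get(doc)` test before using each document.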
-
getPointValues
Returns the PointValues used for numeric or spatial searches for the given field, or null if there are no point fields.
- Throws:
IOException
-
checkIntegrity
Checks consistency of this reader.
Note that this may be costly in terms of I/O, e.g. may involve computing a checksum value against large data files.
- Throws:
IOException
- NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
-
getMetaData
Return metadata about this leaf.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-