public abstract class IndexReader extends Object implements Closeable
IndexReader is an abstract class, providing an interface for accessing a point-in-time view of an index. Any changes made to the index via IndexWriter will not be visible until a new IndexReader is opened. If your IndexWriter is in-process, it's best to use DirectoryReader.open(IndexWriter) to obtain an IndexReader. When you need to re-open to see changes to the index, it's best to use DirectoryReader.openIfChanged(DirectoryReader), since the new reader will share resources with the previous one when possible. Search of an index is done entirely through this abstract interface, so that any subclass which implements it is searchable.
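The visibility rule above (changes are not seen until a new reader is opened) can be pictured with a toy model in plain Java. This is a sketch only and not Lucene code: `ToyWriter`, `ToyReader`, and `openReader()` are hypothetical names, and copying a list merely stands in for opening a reader.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a writer: holds the live, mutable document list. */
final class ToyWriter {
    private final List<String> docs = new ArrayList<>();

    void addDocument(String doc) {
        docs.add(doc);
    }

    /** Opening a reader captures the documents exactly as they are right now. */
    ToyReader openReader() {
        return new ToyReader(Collections.unmodifiableList(new ArrayList<>(docs)));
    }
}

/** Hypothetical stand-in for a reader: an immutable snapshot of the documents. */
final class ToyReader {
    private final List<String> snapshot;

    ToyReader(List<String> snapshot) {
        this.snapshot = snapshot;
    }

    int numDocs() {
        return snapshot.size();
    }
}
```

In this model, a reader opened before a later `addDocument` call keeps reporting the old count; only a reader opened afterwards sees the new document.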
There are two different types of IndexReaders:
LeafReader: These readers are atomic; they do not consist of several sub-readers. They support retrieval of stored fields, doc values, terms, and postings.
CompositeReader: Instances of this reader (like DirectoryReader) can only be used to get stored fields from the underlying LeafReaders; it is not possible to retrieve postings from them directly. To do that, get the sub-readers via CompositeReader.getSequentialSubReaders().
IndexReader instances for indexes on disk are usually constructed with a call to one of the static DirectoryReader.open() methods, e.g. DirectoryReader.open(org.apache.lucene.store.Directory). Because DirectoryReader is a CompositeReader, it is not possible to directly get postings from it.
For efficiency, in this API documents are often referred to via document numbers, non-negative integers which each name a unique document in the index. These document numbers are ephemeral -- they may change as documents are added to and deleted from an index. Clients should thus not rely on a given document having the same number between sessions.
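To see why a document number is not a stable identifier, consider a toy model in which document numbers are simply list positions and a merge drops deleted entries. This is an illustrative sketch in plain Java, not Lucene code; `ToyDocNumbers` and `compact` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Lucene code): a document's number is its position in a list. */
final class ToyDocNumbers {
    /** Drops deleted entries, as a merge would; survivors are renumbered from 0. */
    static List<String> compact(List<String> docs, boolean[] deleted) {
        List<String> merged = new ArrayList<>();
        for (int docID = 0; docID < docs.size(); docID++) {
            if (!deleted[docID]) {
                merged.add(docs.get(docID));
            }
        }
        return merged;
    }
}
```

With documents "a", "b", "c" and "b" deleted, "c" has number 2 before compaction and number 1 afterwards, which is the kind of change the paragraph above warns about.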
NOTE: IndexReader instances are completely thread safe, meaning multiple threads can call any of their methods concurrently. If your application requires external synchronization, you should not synchronize on the IndexReader instance; use your own (non-Lucene) objects instead.
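A minimal sketch of the locking advice in the note above, in plain Java. `SearchService` and its members are hypothetical application code, and the reader is modelled as a plain `Object` so the example has no Lucene dependency.

```java
/** Hypothetical application class; the reader is modelled as a plain Object. */
final class SearchService {
    private final Object reader;               // stands in for an IndexReader
    private final Object lock = new Object();  // private, non-Lucene monitor
    private long queriesServed;

    SearchService(Object reader) {
        this.reader = reader;
    }

    Object reader() {
        return reader;
    }

    long recordQuery() {
        // Guard application state with our own lock, never synchronized (reader).
        synchronized (lock) {
            return ++queriesServed;
        }
    }
}
```

Keeping the monitor private means that application locking cannot interact with any locking the reader may do on itself.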
Modifier and Type | Class and Description |
---|---|
static interface | IndexReader.CacheHelper: A utility class that gives hooks in order to help build a cache based on the data that is contained in this index. |
static class | IndexReader.CacheKey: A cache key identifying a resource that is being cached on. |
static interface | IndexReader.ClosedListener: A listener that is called when a resource gets closed. |
Modifier and Type | Method and Description |
---|---|
void | close(): Closes files associated with this index. |
void | decRef(): Expert: decreases the refCount of this IndexReader instance. |
abstract int | docFreq(Term term): Returns the number of documents containing the term. |
protected abstract void | doClose(): Implements close. |
Document | document(int docID): Returns the stored fields of the nth Document in this index. |
Document | document(int docID, Set<String> fieldsToLoad): Like document(int) but only loads the specified fields. |
abstract void | document(int docID, StoredFieldVisitor visitor): Expert: visits the fields of a stored document, for custom processing/loading of each field. |
protected void | ensureOpen(): Throws AlreadyClosedException if this IndexReader or any of its child readers is closed, otherwise returns. |
boolean | equals(Object obj) |
abstract IndexReaderContext | getContext(): Expert: Returns the root IndexReaderContext for this IndexReader's sub-reader tree. |
abstract int | getDocCount(String field): Returns the number of documents that have at least one term for this field. |
abstract IndexReader.CacheHelper | getReaderCacheHelper(): Optional method: Return an IndexReader.CacheHelper that can be used to cache based on the content of this reader. |
int | getRefCount(): Expert: returns the current refCount for this reader. |
abstract long | getSumDocFreq(String field): Returns the sum of TermsEnum.docFreq() for all terms in this field. |
abstract long | getSumTotalTermFreq(String field): Returns the sum of TermsEnum.totalTermFreq() for all terms in this field. |
Terms | getTermVector(int docID, String field): Retrieve term vector for this document and field, or null if term vectors were not indexed. |
abstract Fields | getTermVectors(int docID): Retrieve term vectors for this document, or null if term vectors were not indexed. |
boolean | hasDeletions(): Returns true if any documents have been deleted. |
int | hashCode() |
void | incRef(): Expert: increments the refCount of this IndexReader instance. |
List<LeafReaderContext> | leaves(): Returns the reader's leaves, or itself if this reader is atomic. |
abstract int | maxDoc(): Returns one greater than the largest possible document number. |
int | numDeletedDocs(): Returns the number of deleted documents. |
abstract int | numDocs(): Returns the number of documents in this index. |
void | registerParentReader(IndexReader reader): Expert: This method is called by IndexReaders which wrap other readers (e.g. CompositeReader or FilterLeafReader) to register the parent at the child (this reader) on construction of the parent. |
abstract long | totalTermFreq(Term term): Returns the total number of occurrences of term across all documents (the sum of the freq() for each doc that has this term). |
boolean | tryIncRef(): Expert: increments the refCount of this IndexReader instance only if the IndexReader has not been closed yet and returns true iff the refCount was successfully incremented, otherwise false. |
public final void registerParentReader(IndexReader reader)
Expert: This method is called by IndexReaders which wrap other readers (e.g. CompositeReader or FilterLeafReader) to register the parent at the child (this reader) on construction of the parent. When this reader is closed, it will mark all registered parents as closed, too. The references to parent readers are weak only, so they can be GCed once they are no longer in use.
public final int getRefCount()
Expert: returns the current refCount for this reader.
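public final void incRef()
Expert: increments the refCount of this IndexReader instance. RefCounts are used to determine when a reader can be closed safely, i.e. as soon as there are no more references. Be sure to always call a corresponding decRef() in a finally clause; otherwise the reader may never be closed. Note that close() simply calls decRef(), which means that the IndexReader will not really be closed until decRef() has been called for all outstanding references.
See Also: decRef(), tryIncRef()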
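The incRef()/decRef() contract can be sketched with a simplified reference counter in plain Java. This is a model under stated assumptions, not Lucene's implementation: `ToyRefCountedReader`, `isReleased()`, and `countWhileBorrowed()` are hypothetical names, the creator is assumed to hold the first reference, and `incRef()` here assumes the caller already holds a reference.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Simplified model of a reference-counted reader; not Lucene's implementation. */
final class ToyRefCountedReader {
    private final AtomicInteger refCount = new AtomicInteger(1); // the creator holds one reference
    private boolean closeCalled;        // makes close() idempotent
    private volatile boolean released;  // true once the last reference is gone

    /** Callers must already hold a reference in this model. */
    void incRef() {
        if (released) {
            throw new IllegalStateException("reader is already closed");
        }
        refCount.incrementAndGet();
    }

    void decRef() {
        if (refCount.decrementAndGet() == 0) {
            released = true; // a real reader would free its files here
        }
    }

    /** In this model, close() only gives up the creator's reference. */
    synchronized void close() {
        if (!closeCalled) {
            closeCalled = true;
            decRef();
        }
    }

    int getRefCount() {
        return refCount.get();
    }

    boolean isReleased() {
        return released;
    }

    /** Borrows the reader for one unit of work and releases it in a finally clause. */
    int countWhileBorrowed() {
        incRef();
        try {
            return getRefCount(); // placeholder for a search or other work
        } finally {
            decRef();
        }
    }
}
```

The try/finally shape in `countWhileBorrowed()` is the important part: the matching `decRef()` runs even if the work in between throws, so the count cannot leak.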
public final boolean tryIncRef()
Expert: increments the refCount of this IndexReader instance only if the IndexReader has not been closed yet, and returns true iff the refCount was successfully incremented, otherwise false. If this method returns false, the reader is either already closed or is currently being closed. Either way, this reader instance shouldn't be used by an application unless true is returned.
RefCounts are used to determine when a reader can be closed safely, i.e. as soon as there are no more references. Be sure to always call a corresponding decRef() in a finally clause; otherwise the reader may never be closed. Note that close() simply calls decRef(), which means that the IndexReader will not really be closed until decRef() has been called for all outstanding references.
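One common way to get the "increment only if not yet closed" behaviour described for tryIncRef() is a compare-and-set loop that refuses to move the count away from zero. The sketch below is plain Java with the hypothetical class name `ToyTryIncRef`; it illustrates the technique and is not presented as the library's source.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of a conditional reference-count increment; names are hypothetical. */
final class ToyTryIncRef {
    private final AtomicInteger refCount = new AtomicInteger(1);

    /** Increments only while the count is still positive; never revives a closed reader. */
    boolean tryIncRef() {
        int count;
        while ((count = refCount.get()) > 0) {
            if (refCount.compareAndSet(count, count + 1)) {
                return true;
            }
            // Another thread changed the count; re-read it and try again.
        }
        return false;
    }

    void decRef() {
        refCount.decrementAndGet();
    }

    int getRefCount() {
        return refCount.get();
    }
}
```

A caller would use the reader only when `tryIncRef()` returned true, and pair that success with a `decRef()` in a finally clause.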
public final void decRef() throws IOException
Expert: decreases the refCount of this IndexReader instance.
Throws: IOException - in case an IOException occurs in doClose()
See Also: incRef()
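protected final void ensureOpen() throws AlreadyClosedException
Throws AlreadyClosedException if this IndexReader or any of its child readers is closed, otherwise returns.
Throws: AlreadyClosedException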
public final boolean equals(Object obj)
IndexReader subclasses are not allowed to implement equals/hashCode, so methods are declared final.
public final int hashCode()
IndexReader subclasses are not allowed to implement equals/hashCode, so methods are declared final.
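public abstract Fields getTermVectors(int docID) throws IOException
Retrieve term vectors for this document, or null if term vectors were not indexed.
Throws: IOException
public final Terms getTermVector(int docID, String field) throws IOException
Retrieve term vector for this document and field, or null if term vectors were not indexed.
Throws: IOException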
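public abstract int numDocs()
Returns the number of documents in this index.
NOTE: This operation may run in O(maxDoc). Implementations that can't return this number in constant time should cache it.
public abstract int maxDoc()
Returns one greater than the largest possible document number.
public final int numDeletedDocs()
Returns the number of deleted documents.
NOTE: This operation may run in O(maxDoc).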
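The relationship between these three counts, and the advice to cache a count that takes O(maxDoc) to compute, can be sketched with a toy segment that tracks live documents in a BitSet. This is plain Java, not Lucene code; `ToyDocCounts` and `delete` are hypothetical, and the sketch assumes that the number of deleted documents equals maxDoc() minus numDocs().

```java
import java.util.BitSet;

/** Toy segment (not Lucene code): live documents are tracked in a BitSet. */
final class ToyDocCounts {
    private final int maxDoc;        // one greater than the largest document number
    private final BitSet liveDocs;   // bit set => document is not deleted
    private int cachedNumDocs = -1;  // -1 means "not computed yet"

    ToyDocCounts(int maxDoc) {
        this.maxDoc = maxDoc;
        this.liveDocs = new BitSet(maxDoc);
        liveDocs.set(0, maxDoc);
    }

    void delete(int docID) {
        liveDocs.clear(docID);
        cachedNumDocs = -1; // invalidate the cached count
    }

    int maxDoc() {
        return maxDoc;
    }

    int numDocs() {
        if (cachedNumDocs < 0) {
            cachedNumDocs = liveDocs.cardinality(); // linear scan, done once per change
        }
        return cachedNumDocs;
    }

    /** Assumption of this sketch: deleted = maxDoc - numDocs. */
    int numDeletedDocs() {
        return maxDoc() - numDocs();
    }

    boolean hasDeletions() {
        return numDeletedDocs() > 0;
    }
}
```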
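public abstract void document(int docID, StoredFieldVisitor visitor) throws IOException
Expert: visits the fields of a stored document, for custom processing/loading of each field. If you want to load all fields, use document(int). If you want to load a subset, use DocumentStoredFieldVisitor.
Throws: IOException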
public final Document document(int docID) throws IOException
Returns the stored fields of the nth Document in this index. This is just sugar for using DocumentStoredFieldVisitor.
NOTE: for performance reasons, this method does not check if the requested document is deleted, and therefore asking for a deleted document may yield unspecified results. Usually this check is not required; however, you can test if the doc is deleted by checking the Bits returned from MultiBits.getLiveDocs(org.apache.lucene.index.IndexReader).
NOTE: only the content of a field is returned, if that field was stored during indexing. Metadata like boost, omitNorm, IndexOptions, tokenized, etc., are not preserved.
Throws:
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public final Document document(int docID, Set<String> fieldsToLoad) throws IOException
Like document(int) but only loads the specified fields. Note that this is simply sugar for DocumentStoredFieldVisitor(Set).
Throws: IOException
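The three document(...) variants share one idea: a visitor is offered each stored field and decides whether to load it. The sketch below models that idea in plain Java. All names (`ToyFieldVisitor`, `ToyCollectingVisitor`, `ToyStoredFields`) are hypothetical and deliberately simpler than Lucene's StoredFieldVisitor; a null field set is assumed here to mean "load everything".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Toy visitor model; hypothetical names, not Lucene's StoredFieldVisitor API. */
interface ToyFieldVisitor {
    boolean needsField(String name);

    void stringField(String name, String value);
}

/** Collects only the requested fields, or every field when fieldsToLoad is null. */
final class ToyCollectingVisitor implements ToyFieldVisitor {
    private final Set<String> fieldsToLoad;
    private final Map<String, String> loaded = new LinkedHashMap<>();

    ToyCollectingVisitor(Set<String> fieldsToLoad) {
        this.fieldsToLoad = fieldsToLoad;
    }

    @Override
    public boolean needsField(String name) {
        return fieldsToLoad == null || fieldsToLoad.contains(name);
    }

    @Override
    public void stringField(String name, String value) {
        loaded.put(name, value);
    }

    Map<String, String> getLoaded() {
        return loaded;
    }
}

final class ToyStoredFields {
    /** Walks the stored fields of one document, offering each to the visitor. */
    static void visit(Map<String, String> storedDoc, ToyFieldVisitor visitor) {
        for (Map.Entry<String, String> field : storedDoc.entrySet()) {
            if (visitor.needsField(field.getKey())) {
                visitor.stringField(field.getKey(), field.getValue());
            }
        }
    }
}
```

Passing a one-element set loads a single field and skips the rest, which is the saving that the fieldsToLoad overload is meant to provide.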
public boolean hasDeletions()
Returns true if any documents have been deleted.
public final void close() throws IOException
Closes files associated with this index.
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable
Throws: IOException - if there is a low-level IO error
protected abstract void doClose() throws IOException
Implements close.
Throws: IOException
public abstract IndexReaderContext getContext()
Expert: Returns the root IndexReaderContext for this IndexReader's sub-reader tree.
Iff this reader is composed of sub-readers, i.e. it is a composite reader, this method returns a CompositeReaderContext holding the reader's direct children as well as a view of the reader tree's atomic leaf contexts. All sub-IndexReaderContext instances referenced from this reader's top-level context are private to this reader and are not shared with another context tree. For example, IndexSearcher uses this API to drive searching by one atomic leaf reader at a time. If this reader is not composed of child readers, this method returns a LeafReaderContext.
Note: None of the sub-CompositeReaderContext instances referenced from this top-level context support CompositeReaderContext.leaves(). Only the top-level context maintains the convenience leaf view, for performance reasons.
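public final List<LeafReaderContext> leaves()
Returns the reader's leaves, or itself if this reader is atomic. This is a convenience method calling this.getContext().leaves().
See Also: IndexReaderContext.leaves()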
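Working one leaf at a time usually means translating a top-level document number into a leaf plus a leaf-local number. The following plain-Java sketch shows that arithmetic under two assumptions that are not stated on this page: each leaf covers a contiguous range of top-level numbers starting at a per-leaf doc base, and no leaf is empty. `ToyLeaves` and its methods are hypothetical names.

```java
import java.util.Arrays;

/** Toy model (not Lucene code) of a composite reader's flat leaf view. */
final class ToyLeaves {
    private final int[] docBases; // first top-level document number of each leaf
    private final int maxDoc;     // total number of document slots across all leaves

    ToyLeaves(int... leafMaxDocs) {
        docBases = new int[leafMaxDocs.length];
        int base = 0;
        for (int i = 0; i < leafMaxDocs.length; i++) {
            if (leafMaxDocs[i] <= 0) {
                throw new IllegalArgumentException("leaves must be non-empty in this model");
            }
            docBases[i] = base;
            base += leafMaxDocs[i];
        }
        maxDoc = base;
    }

    /** Index of the leaf that holds the given top-level document number. */
    int leafIndex(int docID) {
        if (docID < 0 || docID >= maxDoc) {
            throw new IndexOutOfBoundsException("docID out of range: " + docID);
        }
        int pos = Arrays.binarySearch(docBases, docID);
        // Exact hit: docID is the first document of that leaf.
        // Miss: binarySearch returns -(insertionPoint) - 1; the leaf is the one before it.
        return pos >= 0 ? pos : -pos - 2;
    }

    /** Document number relative to the start of its leaf. */
    int localDocID(int docID) {
        return docID - docBases[leafIndex(docID)];
    }
}
```

For leaves of sizes 3, 2, and 4, top-level document 4 falls in the second leaf (index 1) as its local document 1.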
public abstract IndexReader.CacheHelper getReaderCacheHelper()
Optional method: Return an IndexReader.CacheHelper that can be used to cache based on the content of this reader. Two readers that have different data or different sets of deleted documents will be considered different.
A return value of null indicates that this reader is not suited for caching, which is typically the case for short-lived wrappers that alter the content of the wrapped reader.
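A cache built on such a helper typically keys entries on an object that identifies the reader's content, evicts them when the reader is closed, and bypasses the cache when no helper is available. The sketch below models that pattern in plain Java. `ToyCacheHelper`, `ToyReaderCache`, and `notifyClosed()` are hypothetical stand-ins, not the IndexReader.CacheHelper API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Toy cache helper (hypothetical): an identity key plus close listeners. */
final class ToyCacheHelper {
    private final Object key = new Object(); // unique per reader content
    private final List<Consumer<Object>> closedListeners = new ArrayList<>();

    Object getKey() {
        return key;
    }

    void addClosedListener(Consumer<Object> listener) {
        closedListeners.add(listener);
    }

    /** Would be called by the owning reader when it is closed. */
    void notifyClosed() {
        for (Consumer<Object> listener : closedListeners) {
            listener.accept(key);
        }
    }
}

/** Per-reader cache that evicts an entry once its reader is closed. */
final class ToyReaderCache {
    private final Map<Object, String> cache = new ConcurrentHashMap<>();
    private int computations;

    String get(ToyCacheHelper helper) {
        if (helper == null) {
            return compute(); // reader not suited for caching: always recompute
        }
        return cache.computeIfAbsent(helper.getKey(), key -> {
            // Register eviction the first time this reader is seen.
            helper.addClosedListener(closedKey -> cache.remove(closedKey));
            return compute();
        });
    }

    private String compute() {
        computations++;
        return "expensive-value"; // placeholder for data derived from the reader
    }

    int computations() {
        return computations;
    }

    int size() {
        return cache.size();
    }
}
```

Evicting from a close listener keeps the cache from holding on to data for readers that no longer exist.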
public abstract int docFreq(Term term) throws IOException
Returns the number of documents containing the term. This method returns 0 if the term or field does not exist. This method does not take into account deleted documents that have not yet been merged away.
Throws: IOException
See Also: TermsEnum.docFreq()
public abstract long totalTermFreq(Term term) throws IOException
Returns the total number of occurrences of term across all documents (the sum of the freq() for each doc that has this term). Note that, like other term measures, this measure does not take deleted documents into account.
Throws: IOException
public abstract long getSumDocFreq(String field) throws IOException
Returns the sum of TermsEnum.docFreq() for all terms in this field. Note that, just like other term measures, this measure does not take deleted documents into account.
Throws: IOException
See Also: Terms.getSumDocFreq()
public abstract int getDocCount(String field) throws IOException
Returns the number of documents that have at least one term for this field.
Throws: IOException
See Also: Terms.getDocCount()
public abstract long getSumTotalTermFreq(String field) throws IOException
Returns the sum of TermsEnum.totalTermFreq() for all terms in this field. Note that, just like other term measures, this measure does not take deleted documents into account.
Throws: IOException
See Also: Terms.getSumTotalTermFreq()
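How the five statistics relate to each other is easiest to see on a tiny inverted index. The sketch below is plain Java for a single field and is not Lucene code: `ToyFieldStats` and `addDocument` are hypothetical, and postings are modelled as a map from term to per-document frequency. Nothing is ever removed from the postings, which mirrors the statement that these measures do not take deleted documents into account.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy single-field inverted index (not Lucene code): term -> (docID -> freq). */
final class ToyFieldStats {
    private final Map<String, Map<Integer, Integer>> postings = new TreeMap<>();

    void addDocument(int docID, String... tokens) {
        for (String token : tokens) {
            postings.computeIfAbsent(token, t -> new TreeMap<>()).merge(docID, 1, Integer::sum);
        }
    }

    /** Number of documents containing the term; 0 if the term is unknown. */
    int docFreq(String term) {
        Map<Integer, Integer> docs = postings.get(term);
        return docs == null ? 0 : docs.size();
    }

    /** Total occurrences of the term: the sum of its per-document frequencies. */
    long totalTermFreq(String term) {
        long total = 0;
        Map<Integer, Integer> docs = postings.get(term);
        if (docs != null) {
            for (int freq : docs.values()) {
                total += freq;
            }
        }
        return total;
    }

    /** Sum of docFreq over every term in the field. */
    long getSumDocFreq() {
        long sum = 0;
        for (String term : postings.keySet()) {
            sum += docFreq(term);
        }
        return sum;
    }

    /** Sum of totalTermFreq over every term in the field. */
    long getSumTotalTermFreq() {
        long sum = 0;
        for (String term : postings.keySet()) {
            sum += totalTermFreq(term);
        }
        return sum;
    }

    /** Number of documents that have at least one term in the field. */
    int getDocCount() {
        Set<Integer> docs = new HashSet<>();
        for (Map<Integer, Integer> perTerm : postings.values()) {
            docs.addAll(perTerm.keySet());
        }
        return docs.size();
    }
}
```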
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.