public abstract class IndexReader extends Object implements Closeable
There are two different types of IndexReaders:
AtomicReader: These indexes do not consist of several sub-readers; they are atomic. They support retrieval of stored fields, doc values, terms, and postings.
CompositeReader: Instances of this reader (like DirectoryReader) can only be used to get stored fields from the underlying AtomicReaders; it is not possible to directly retrieve postings. To do that, get the sub-readers via CompositeReader.getSequentialSubReaders(). Alternatively, you can mimic an AtomicReader (with a serious slowdown) by wrapping composite readers with SlowCompositeReaderWrapper.
IndexReader instances for indexes on disk are usually constructed with a call to one of the static DirectoryReader.open() methods, e.g. DirectoryReader.open(org.apache.lucene.store.Directory). Because DirectoryReader is a CompositeReader, it is not possible to directly get postings from it.
For efficiency, in this API documents are often referred to via document numbers, non-negative integers which each name a unique document in the index. These document numbers are ephemeral -- they may change as documents are added to and deleted from an index. Clients should thus not rely on a given document having the same number between sessions.
NOTE: IndexReader instances are completely thread-safe, meaning multiple threads can call any of their methods concurrently. If your application requires external synchronization, you should not synchronize on the IndexReader instance; use your own (non-Lucene) objects instead.
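
The advice in the note can be sketched as follows. This is a minimal illustration rather than Lucene code: `SearchState` is a hypothetical application class, and a plain `Object` stands in for the reader so that the example compiles without Lucene on the classpath. The only point it makes is that application-level locking goes through a private lock object, never through the reader instance itself.

```java
/** Hypothetical application class; not part of Lucene. */
final class SearchState {
    // Private lock owned by the application. No other code can synchronize on it,
    // so it cannot interfere with any locking done elsewhere on the reader.
    private final Object lock = new Object();

    // Stand-in for an IndexReader; a plain Object keeps the sketch self-contained.
    private Object reader;
    private long generation;

    SearchState(Object initialReader) {
        this.reader = initialReader;
    }

    /** Swaps in a new reader and bumps the generation as one atomic step. */
    void swap(Object newReader) {
        synchronized (lock) { // deliberately not synchronized (reader)
            reader = newReader;
            generation++;
        }
    }

    Object currentReader() {
        synchronized (lock) {
            return reader;
        }
    }

    long generation() {
        synchronized (lock) {
            return generation;
        }
    }
}
```

Calls on the reader itself need no lock; the lock only protects the application's own state (here, which reader is current and its generation).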
Modifier and Type | Class and Description |
---|---|
static interface | IndexReader.ReaderClosedListener - A custom listener that's invoked when the IndexReader is closed. |
Modifier and Type | Method and Description |
---|---|
void | addReaderClosedListener(IndexReader.ReaderClosedListener listener) - Expert: adds an IndexReader.ReaderClosedListener. |
void | close() - Closes files associated with this index. |
void | decRef() - Expert: decreases the refCount of this IndexReader instance. |
abstract int | docFreq(Term term) - Returns the number of documents containing the term. |
protected abstract void | doClose() - Implements close. |
Document | document(int docID) - Returns the stored fields of the nth Document in this index. |
Document | document(int docID, Set<String> fieldsToLoad) - Like document(int) but only loads the specified fields. |
abstract void | document(int docID, StoredFieldVisitor visitor) - Expert: visits the fields of a stored document, for custom processing/loading of each field. |
protected void | ensureOpen() - Throws AlreadyClosedException if this IndexReader or any of its child readers is closed, otherwise returns. |
boolean | equals(Object obj) |
Object | getCombinedCoreAndDeletesKey() - Expert: Returns a key for this IndexReader that also includes deletions, so FieldCache/CachingWrapperFilter can find it again. |
abstract IndexReaderContext | getContext() - Expert: Returns the root IndexReaderContext for this IndexReader's sub-reader tree. |
Object | getCoreCacheKey() - Expert: Returns a key for this IndexReader, so FieldCache/CachingWrapperFilter can find it again. |
abstract int | getDocCount(String field) - Returns the number of documents that have at least one term for this field, or -1 if this measure isn't stored by the codec. |
int | getRefCount() - Expert: returns the current refCount for this reader. |
abstract long | getSumDocFreq(String field) - Returns the sum of TermsEnum.docFreq() for all terms in this field, or -1 if this measure isn't stored by the codec. |
abstract long | getSumTotalTermFreq(String field) - Returns the sum of TermsEnum.totalTermFreq() for all terms in this field, or -1 if this measure isn't stored by the codec (or if this field omits term freq and positions). |
Terms | getTermVector(int docID, String field) - Retrieve term vector for this document and field, or null if term vectors were not indexed. |
abstract Fields | getTermVectors(int docID) - Retrieve term vectors for this document, or null if term vectors were not indexed. |
boolean | hasDeletions() - Returns true if any documents have been deleted. |
int | hashCode() |
void | incRef() - Expert: increments the refCount of this IndexReader instance. |
List<AtomicReaderContext> | leaves() - Returns the reader's leaves, or itself if this reader is atomic. |
abstract int | maxDoc() - Returns one greater than the largest possible document number. |
int | numDeletedDocs() - Returns the number of deleted documents. |
abstract int | numDocs() - Returns the number of documents in this index. |
static DirectoryReader | open(Directory directory) - Deprecated. |
static DirectoryReader | open(Directory directory, int termInfosIndexDivisor) - Deprecated. |
static DirectoryReader | open(IndexCommit commit) - Deprecated. |
static DirectoryReader | open(IndexCommit commit, int termInfosIndexDivisor) - Deprecated. |
static DirectoryReader | open(IndexWriter writer, boolean applyAllDeletes) - Deprecated. |
void | registerParentReader(IndexReader reader) - Expert: This method is called by IndexReaders which wrap other readers (e.g. CompositeReader or FilterAtomicReader) to register the parent at the child (this reader) on construction of the parent. |
void | removeReaderClosedListener(IndexReader.ReaderClosedListener listener) - Expert: removes a previously added IndexReader.ReaderClosedListener. |
abstract long | totalTermFreq(Term term) - Returns the total number of occurrences of term across all documents (the sum of the freq() for each doc that has this term). |
boolean | tryIncRef() - Expert: increments the refCount of this IndexReader instance only if the IndexReader has not been closed yet and returns true iff the refCount was successfully incremented, otherwise false. |
public final void addReaderClosedListener(IndexReader.ReaderClosedListener listener)
Expert: adds an IndexReader.ReaderClosedListener. The provided listener will be invoked when this reader is closed.
public final void removeReaderClosedListener(IndexReader.ReaderClosedListener listener)
Expert: removes a previously added IndexReader.ReaderClosedListener.
public final void registerParentReader(IndexReader reader)
Expert: This method is called by IndexReaders which wrap other readers (e.g. CompositeReader or FilterAtomicReader) to register the parent at the child (this reader) on construction of the parent. When this reader is closed, it will mark all registered parents as closed, too. The references to parent readers are weak only, so they can be GCed once they are no longer in use.
public final int getRefCount()
Expert: returns the current refCount for this reader.
public final void incRef()
Expert: increments the refCount of this IndexReader instance. RefCounts are used to determine when a reader can be closed safely, i.e. as soon as there are no more references. Be sure to always call a corresponding decRef() in a finally clause; otherwise the reader may never be closed. Note that close() simply calls decRef(), which means that the IndexReader will not really be closed until decRef() has been called for all outstanding references.
See Also: decRef(), tryIncRef()
public final boolean tryIncRef()
Expert: increments the refCount of this IndexReader instance only if the IndexReader has not been closed yet and returns true iff the refCount was successfully incremented, otherwise false. If this method returns false, the reader is either already closed or is currently being closed. Either way this reader instance shouldn't be used by an application unless true is returned.
RefCounts are used to determine when a reader can be closed safely, i.e. as soon as there are no more references. Be sure to always call a corresponding decRef() in a finally clause; otherwise the reader may never be closed. Note that close() simply calls decRef(), which means that the IndexReader will not really be closed until decRef() has been called for all outstanding references.
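
The reference-counting contract described for incRef(), tryIncRef(), decRef(), and close() can be modelled with a small stand-alone class. This is a toy sketch under stated assumptions, not Lucene's implementation: `RefCountedResource`, `useSafely`, and `isReleased` are hypothetical names, the count is assumed to start at 1 for the creator, and "closing" is reduced to the count reaching zero.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of the reference-counting contract; not Lucene's implementation. */
final class RefCountedResource {
    // Assumption: a new resource starts with one reference, owned by its creator.
    private final AtomicInteger refCount = new AtomicInteger(1);
    private boolean closeCalled = false;

    int getRefCount() {
        return refCount.get();
    }

    /** Increments the count only if the resource has not been released yet. */
    boolean tryIncRef() {
        int count;
        while ((count = refCount.get()) > 0) {
            // compareAndSet guards against a concurrent decRef() reaching zero.
            if (refCount.compareAndSet(count, count + 1)) {
                return true;
            }
        }
        return false;
    }

    /** Increments the count, failing loudly if the resource is already released. */
    void incRef() {
        if (!tryIncRef()) {
            throw new IllegalStateException("resource is closed");
        }
    }

    /** Decrements the count; a real reader would release its files at zero. */
    void decRef() {
        if (refCount.decrementAndGet() < 0) {
            throw new IllegalStateException("too many decRef() calls");
        }
    }

    /** close() gives up the creator's reference exactly once. */
    synchronized void close() {
        if (!closeCalled) {
            closeCalled = true;
            decRef();
        }
    }

    boolean isReleased() {
        return refCount.get() <= 0;
    }

    /** Usage pattern: pair every successful tryIncRef() with decRef() in a finally clause. */
    static boolean useSafely(RefCountedResource resource, Runnable work) {
        if (!resource.tryIncRef()) {
            return false; // already closed or being closed; do not touch it
        }
        try {
            work.run();
            return true;
        } finally {
            resource.decRef();
        }
    }
}
```

In this model, a thread that shares the resource calls `useSafely` (or the equivalent try/finally by hand), so the resource stays open for the duration of the work even if the owner calls close() in the meantime.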
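public final void decRef() throws IOException
Expert: decreases the refCount of this IndexReader instance.
Throws: IOException - in case an IOException occurs in doClose()
See Also: incRef()
protected final void ensureOpen() throws AlreadyClosedException
Throws AlreadyClosedException if this IndexReader or any of its child readers is closed, otherwise returns.
Throws: AlreadyClosedException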
public final boolean equals(Object obj)
For caching purposes, IndexReader subclasses are not allowed to implement equals/hashCode, so the methods are declared final. To look up instances from caches, use getCoreCacheKey() and getCombinedCoreAndDeletesKey().
public final int hashCode()
For caching purposes, IndexReader subclasses are not allowed to implement equals/hashCode, so the methods are declared final. To look up instances from caches, use getCoreCacheKey() and getCombinedCoreAndDeletesKey().
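
A cache that looks entries up by an opaque key object, in the way the paragraph above suggests for getCoreCacheKey(), might be sketched like this. `PerReaderCache` is a hypothetical class, not a Lucene API; the sketch assumes the key objects do not override equals/hashCode, so lookups are effectively by identity, and it uses weak keys so that an entry can go away once its key is no longer referenced.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

/** Hypothetical per-reader cache; the key plays the role of a reader's cache key. */
final class PerReaderCache<V> {
    // Weak keys let an entry disappear once its key object is no longer referenced.
    private final Map<Object, V> entries =
            Collections.synchronizedMap(new WeakHashMap<Object, V>());

    /** Returns the cached value for this key, computing it on first use. */
    V get(Object cacheKey, Function<Object, V> loader) {
        return entries.computeIfAbsent(cacheKey, loader);
    }
}
```

With a real reader, the first argument would be the object returned by getCoreCacheKey() or getCombinedCoreAndDeletesKey(), depending on whether the cached value is sensitive to deletions.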
@Deprecated public static DirectoryReader open(Directory directory) throws IOException
Deprecated. Use DirectoryReader.open(Directory) instead.
Parameters:
directory - the index directory
Throws: IOException - if there is a low-level IO error
@Deprecated public static DirectoryReader open(Directory directory, int termInfosIndexDivisor) throws IOException
Deprecated. Use DirectoryReader.open(Directory,int) instead.
Parameters:
directory - the index directory
termInfosIndexDivisor - Subsamples which indexed terms are loaded into RAM. This has the same effect as IndexWriterConfig.setTermIndexInterval(int), except that setting must be done at indexing time while this setting can be set per reader. When set to N, one in every N*termIndexInterval terms in the index is loaded into memory. By setting this to a value > 1 you can reduce memory usage, at the expense of higher latency when loading a TermInfo. The default value is 1. Set this to -1 to skip loading the terms index entirely.
Throws: IOException - if there is a low-level IO error
@Deprecated public static DirectoryReader open(IndexWriter writer, boolean applyAllDeletes) throws IOException
Deprecated. Use DirectoryReader.open(IndexWriter,boolean) instead.
Opens a reader from the given IndexWriter.
Parameters:
writer - The IndexWriter to open from
applyAllDeletes - If true, all buffered deletes will be applied (made visible) in the returned reader. If false, the deletes are not applied but remain buffered (in IndexWriter) so that they will be applied in the future. Applying deletes can be costly, so if your app can tolerate deleted documents being returned you might gain some performance by passing false.
Throws: IOException - if there is a low-level IO error
See Also: DirectoryReader.openIfChanged(DirectoryReader,IndexWriter,boolean)
@Deprecated public static DirectoryReader open(IndexCommit commit) throws IOException
Deprecated. Use DirectoryReader.open(IndexCommit) instead.
Opens a reader on the index in the given IndexCommit.
Parameters:
commit - the commit point to open
Throws: IOException - if there is a low-level IO error
@Deprecated public static DirectoryReader open(IndexCommit commit, int termInfosIndexDivisor) throws IOException
Deprecated. Use DirectoryReader.open(IndexCommit,int) instead.
Opens a reader on the index in the given IndexCommit, with the given termInfosIndexDivisor.
Parameters:
commit - the commit point to open
termInfosIndexDivisor - Subsamples which indexed terms are loaded into RAM. This has the same effect as IndexWriterConfig.setTermIndexInterval(int), except that setting must be done at indexing time while this setting can be set per reader. When set to N, one in every N*termIndexInterval terms in the index is loaded into memory. By setting this to a value > 1 you can reduce memory usage, at the expense of higher latency when loading a TermInfo. The default value is 1. Set this to -1 to skip loading the terms index entirely.
Throws: IOException - if there is a low-level IO error
public abstract Fields getTermVectors(int docID) throws IOException
Retrieve term vectors for this document, or null if term vectors were not indexed.
Throws: IOException
public final Terms getTermVector(int docID, String field) throws IOException
Retrieve term vector for this document and field, or null if term vectors were not indexed.
Throws: IOException
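public abstract int numDocs()
Returns the number of documents in this index.
public abstract int maxDoc()
Returns one greater than the largest possible document number.
public final int numDeletedDocs()
Returns the number of deleted documents.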
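
How numDocs(), maxDoc(), and numDeletedDocs() relate can be illustrated with a toy model. The sketch below is not Lucene code: `DocCounts` is a hypothetical class, and it assumes that numDocs() counts only live (non-deleted) documents and that numDeletedDocs() equals maxDoc() minus numDocs(). A `BitSet` stands in for the live-docs bits.

```java
import java.util.BitSet;

/** Toy segment: document numbers 0..maxDoc-1, with one bit set per live document. */
final class DocCounts {
    private final int maxDoc;
    private final BitSet liveDocs;

    DocCounts(int maxDoc) {
        this.maxDoc = maxDoc;
        this.liveDocs = new BitSet(maxDoc);
        liveDocs.set(0, maxDoc); // every document starts out live
    }

    void delete(int docID) {
        if (docID < 0 || docID >= maxDoc) {
            throw new IndexOutOfBoundsException("docID " + docID);
        }
        liveDocs.clear(docID); // the number is not reused; the slot is only marked deleted
    }

    /** One greater than the largest possible document number. */
    int maxDoc() {
        return maxDoc;
    }

    /** Assumption: counts live documents only. */
    int numDocs() {
        return liveDocs.cardinality();
    }

    int numDeletedDocs() {
        return maxDoc() - numDocs();
    }

    boolean hasDeletions() {
        return numDeletedDocs() > 0;
    }

    boolean isLive(int docID) {
        return liveDocs.get(docID);
    }
}
```

In this model, deleting a document leaves maxDoc() unchanged, which is why a loop over 0..maxDoc()-1 has to consult the live bits before using a document number.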
public abstract void document(int docID, StoredFieldVisitor visitor) throws IOException
Expert: visits the fields of a stored document, for custom processing/loading of each field. If you want to load all fields, use document(int). If you want to load a subset, use DocumentStoredFieldVisitor.
Throws: IOException
public final Document document(int docID) throws IOException
Returns the stored fields of the nth Document in this index. This is just sugar for using DocumentStoredFieldVisitor.
NOTE: for performance reasons, this method does not check if the requested document is deleted, and therefore asking for a deleted document may yield unspecified results. Usually this check is not required; however, you can test if the doc is deleted by checking the Bits returned from MultiFields.getLiveDocs(org.apache.lucene.index.IndexReader).
NOTE: only the content of a field is returned, if that field was stored during indexing. Metadata like boost, omitNorm, IndexOptions, tokenized, etc., are not preserved.
Throws: IOException - if there is a low-level IO error
public final Document document(int docID, Set<String> fieldsToLoad) throws IOException
Like document(int) but only loads the specified fields. Note that this is simply sugar for DocumentStoredFieldVisitor.DocumentStoredFieldVisitor(Set).
Throws: IOException
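public boolean hasDeletions()
Returns true if any documents have been deleted.
public final void close() throws IOException
Closes files associated with this index.
Specified by: close in interface Closeable; close in interface AutoCloseable
Throws: IOException - if there is a low-level IO error
protected abstract void doClose() throws IOException
Implements close.
Throws: IOException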
public abstract IndexReaderContext getContext()
Expert: Returns the root IndexReaderContext for this IndexReader's sub-reader tree.
If this reader is composed of sub-readers, i.e. it is a composite reader, this method returns a CompositeReaderContext holding the reader's direct children as well as a view of the reader tree's atomic leaf contexts. All sub-IndexReaderContext instances referenced from this reader's top-level context are private to this reader and are not shared with another context tree. For example, IndexSearcher uses this API to drive searching by one atomic leaf reader at a time. If this reader is not composed of child readers, this method returns an AtomicReaderContext.
Note: Any of the sub-CompositeReaderContext instances referenced from this top-level context do not support CompositeReaderContext.leaves(). Only the top-level context maintains the convenience leaf-view for performance reasons.
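public final List<AtomicReaderContext> leaves()
Returns the reader's leaves, or itself if this reader is atomic. This is equivalent to calling this.getContext().leaves().
See Also: IndexReaderContext.leaves()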
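
When a composite reader is searched one leaf at a time, a top-level document number has to be translated into a leaf plus a leaf-local document number. The sketch below shows one way to do that arithmetic. It is a toy under stated assumptions, not Lucene's code: `LeafResolver` is a hypothetical class, and it assumes each leaf covers a contiguous range of top-level document numbers starting at a per-leaf base (called `docBases` here).

```java
import java.util.Arrays;

/** Toy model of resolving a top-level document number to a leaf; names are illustrative. */
final class LeafResolver {
    private final int[] docBases; // first top-level doc number covered by each leaf
    private final int maxDoc;     // total across all leaves

    /** Takes the maxDoc of each leaf, in leaf order. */
    LeafResolver(int... leafMaxDocs) {
        docBases = new int[leafMaxDocs.length];
        int base = 0;
        for (int i = 0; i < leafMaxDocs.length; i++) {
            docBases[i] = base;
            base += leafMaxDocs[i];
        }
        maxDoc = base;
    }

    /** Index of the leaf that contains the given top-level document number. */
    int leafIndex(int docID) {
        if (docID < 0 || docID >= maxDoc) {
            throw new IndexOutOfBoundsException("docID " + docID);
        }
        int pos = Arrays.binarySearch(docBases, docID);
        if (pos < 0) {
            pos = -pos - 2; // insertion point minus one: last leaf whose base is below docID
        }
        // Empty leaves share a base with their successor; skip forward past them.
        while (pos + 1 < docBases.length && docBases[pos + 1] == docBases[pos]) {
            pos++;
        }
        return pos;
    }

    /** Document number relative to the start of its leaf. */
    int localDocID(int docID) {
        return docID - docBases[leafIndex(docID)];
    }
}
```

For leaves of size 3, 4, and 2, the bases are 0, 3, and 7, so top-level document 5 resolves to leaf 1 with local document number 2.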
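public Object getCoreCacheKey()
Expert: Returns a key for this IndexReader, so FieldCache/CachingWrapperFilter can find it again.
public Object getCombinedCoreAndDeletesKey()
Expert: Returns a key for this IndexReader that also includes deletions, so FieldCache/CachingWrapperFilter can find it again.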
public abstract int docFreq(Term term) throws IOException
Returns the number of documents containing the term. This method returns 0 if the term or field does not exist. This method does not take into account deleted documents that have not yet been merged away.
Throws: IOException
See Also: TermsEnum.docFreq()
public abstract long totalTermFreq(Term term) throws IOException
Returns the total number of occurrences of term across all documents (the sum of the freq() for each doc that has this term). This will be -1 if the codec doesn't support this measure. Note that, like other term measures, this measure does not take deleted documents into account.
Throws: IOException
public abstract long getSumDocFreq(String field) throws IOException
Returns the sum of TermsEnum.docFreq() for all terms in this field, or -1 if this measure isn't stored by the codec. Note that, just like other term measures, this measure does not take deleted documents into account.
Throws: IOException
See Also: Terms.getSumDocFreq()
public abstract int getDocCount(String field) throws IOException
Returns the number of documents that have at least one term for this field, or -1 if this measure isn't stored by the codec.
Throws: IOException
See Also: Terms.getDocCount()
public abstract long getSumTotalTermFreq(String field) throws IOException
Returns the sum of TermsEnum.totalTermFreq() for all terms in this field, or -1 if this measure isn't stored by the codec (or if this field omits term freq and positions). Note that, just like other term measures, this measure does not take deleted documents into account.
Throws: IOException
See Also: Terms.getSumTotalTermFreq()
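
The five term statistics (docFreq, totalTermFreq, getSumDocFreq, getDocCount, getSumTotalTermFreq) are easy to confuse, so the sketch below computes each of them over a tiny in-memory postings map for a single field. `FieldStats` is a hypothetical class that only mirrors the definitions given in the method descriptions; it does not model codecs that return -1, and it does not model deletions.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy statistics for one field: term -> (docID -> within-document frequency). */
final class FieldStats {
    private final Map<String, Map<Integer, Integer>> postings = new HashMap<>();

    /** Records that the term occurs freq more times in the given document. */
    void add(String term, int docID, int freq) {
        postings.computeIfAbsent(term, t -> new HashMap<>()).merge(docID, freq, Integer::sum);
    }

    /** Number of documents containing the term; 0 if the term does not exist. */
    int docFreq(String term) {
        Map<Integer, Integer> docs = postings.get(term);
        return docs == null ? 0 : docs.size();
    }

    /** Total occurrences of the term: the sum of freq over every document that has it. */
    long totalTermFreq(String term) {
        Map<Integer, Integer> docs = postings.get(term);
        long sum = 0;
        if (docs != null) {
            for (int freq : docs.values()) {
                sum += freq;
            }
        }
        return sum;
    }

    /** Sum of docFreq over all terms in the field. */
    long sumDocFreq() {
        long sum = 0;
        for (Map<Integer, Integer> docs : postings.values()) {
            sum += docs.size();
        }
        return sum;
    }

    /** Sum of totalTermFreq over all terms in the field. */
    long sumTotalTermFreq() {
        long sum = 0;
        for (String term : postings.keySet()) {
            sum += totalTermFreq(term);
        }
        return sum;
    }

    /** Number of documents that have at least one term in the field. */
    int docCount() {
        Set<Integer> docs = new HashSet<>();
        for (Map<Integer, Integer> perTerm : postings.values()) {
            docs.addAll(perTerm.keySet());
        }
        return docs.size();
    }
}
```

Note that sumDocFreq counts a document once per distinct term it contains, while docCount counts it once in total, so the two differ whenever a document holds more than one term.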
Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.