public abstract class IndexReader extends Object implements Cloneable, Closeable
Concrete subclasses of IndexReader are usually constructed with a call to one of the static open() methods, e.g. open(Directory, boolean).
For efficiency, in this API documents are often referred to via document numbers, non-negative integers which each name a unique document in the index. These document numbers are ephemeral--they may change as documents are added to and deleted from an index. Clients should thus not rely on a given document having the same number between sessions.
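The following is a minimal sketch of why a document number should not be stored between sessions. It uses a toy in-memory class, not Lucene code: `ToyIndex`, `compact()` and `numberOf()` are hypothetical names, and the compaction step is an assumption standing in for whatever the index does internally when it reclaims deleted documents.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Toy index illustrating ephemeral document numbers; NOT Lucene code. */
class ToyIndex {
    private final List<String> docs = new ArrayList<>();
    private final BitSet deleted = new BitSet();

    /** Adds a document and returns the number it currently has. */
    int add(String id) {
        docs.add(id);
        return docs.size() - 1;
    }

    /** Marks a document as deleted without renumbering anything. */
    void delete(int docNum) {
        deleted.set(docNum);
    }

    /** One greater than the largest possible document number. */
    int maxDoc() {
        return docs.size();
    }

    /** Number of live (non-deleted) documents. */
    int numDocs() {
        return docs.size() - deleted.cardinality();
    }

    /** Returns the current number of the live document with this id, or -1. */
    int numberOf(String id) {
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i) && docs.get(i).equals(id)) {
                return i;
            }
        }
        return -1;
    }

    /** Assumption: reclaiming deleted documents renumbers the survivors. */
    void compact() {
        List<String> live = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i)) {
                live.add(docs.get(i));
            }
        }
        docs.clear();
        docs.addAll(live);
        deleted.clear();
    }
}
```

In this model a document keeps its number while deletions are only marked, but the same document can come back under a different number once the deleted slots are reclaimed. A stable application-level identifier (stored as a field) is the safer key.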
An IndexReader can be opened on a directory for which an IndexWriter is already open, but in that case it cannot be used to delete documents from the index.
NOTE: for backwards API compatibility, several methods are not listed as abstract, but have no useful implementations in this base class and instead always throw UnsupportedOperationException. Subclasses are strongly encouraged to override these methods, but in many cases may not need to.
NOTE: as of 2.4, it's possible to open a read-only IndexReader using the static open methods that accept the boolean readOnly parameter. Such a reader has better concurrency as it's not necessary to synchronize on the isDeleted method. You must specify false if you want to make changes with the resulting IndexReader.
NOTE: IndexReader instances are completely thread safe, meaning multiple threads can call any of their methods concurrently. If your application requires external synchronization, you should not synchronize on the IndexReader instance; use your own (non-Lucene) objects instead.
Modifier and Type | Class and Description |
---|---|
static interface |
IndexReader.ReaderClosedListener
A custom listener that's invoked when the IndexReader
is closed.
|
Modifier and Type | Field and Description |
---|---|
protected boolean |
hasChanges |
Modifier | Constructor and Description |
---|---|
protected |
IndexReader() |
Modifier and Type | Method and Description |
---|---|
protected void |
acquireWriteLock()
Deprecated.
Write support will be removed in Lucene 4.0.
|
void |
addReaderClosedListener(IndexReader.ReaderClosedListener listener)
Expert: adds an
IndexReader.ReaderClosedListener . |
Object |
clone()
Efficiently clones the IndexReader (sharing most
internal state).
|
IndexReader |
clone(boolean openReadOnly)
Deprecated.
Write support will be removed in Lucene 4.0.
Use
clone() instead. |
void |
close()
Closes files associated with this index.
|
protected void |
commit()
Deprecated.
Write support will be removed in Lucene 4.0.
|
void |
commit(Map<String,String> commitUserData)
Deprecated.
Write support will be removed in Lucene 4.0.
|
void |
decRef()
Expert: decreases the refCount of this IndexReader
instance.
|
void |
deleteDocument(int docNum)
Deprecated.
Write support will be removed in Lucene 4.0.
Use
IndexWriter.deleteDocuments(Term) instead |
int |
deleteDocuments(Term term)
Deprecated.
Write support will be removed in Lucene 4.0.
Use
IndexWriter.deleteDocuments(Term) instead |
Directory |
directory()
Returns the directory associated with this index.
|
abstract int |
docFreq(Term t)
Returns the number of documents containing the term
t . |
protected abstract void |
doClose()
Implements close.
|
protected abstract void |
doCommit(Map<String,String> commitUserData)
Deprecated.
Write support will be removed in Lucene 4.0.
|
Document |
document(int n)
Returns the stored fields of the
n th
Document in this index. |
abstract Document |
document(int n,
FieldSelector fieldSelector)
Get the
Document at the n
th position. |
protected abstract void |
doDelete(int docNum)
Deprecated.
Write support will be removed in Lucene 4.0.
Use
IndexWriter.deleteDocuments(Term) instead |
protected IndexReader |
doOpenIfChanged()
If the index has changed since it was opened, open and return a new reader;
else, return
null . |
protected IndexReader |
doOpenIfChanged(boolean openReadOnly)
Deprecated.
Write support will be removed in Lucene 4.0.
Use
doOpenIfChanged() instead |
protected IndexReader |
doOpenIfChanged(IndexCommit commit)
If the index has changed since it was opened, open and return a new reader;
else, return
null . |
protected IndexReader |
doOpenIfChanged(IndexWriter writer,
boolean applyAllDeletes)
If the index has changed since it was opened, open and return a new reader;
else, return
null . |
protected abstract void |
doSetNorm(int doc,
String field,
byte value)
Deprecated.
Write support will be removed in Lucene 4.0.
There will be no replacement for this method.
|
protected abstract void |
doUndeleteAll()
Deprecated.
Write support will be removed in Lucene 4.0.
There will be no replacement for this method.
|
protected void |
ensureOpen() |
void |
flush()
Deprecated.
Write support will be removed in Lucene 4.0.
|
void |
flush(Map<String,String> commitUserData)
Deprecated.
Write support will be removed in Lucene 4.0.
|
Map<String,String> |
getCommitUserData()
Deprecated.
Call
getIndexCommit() and then call
IndexCommit.getUserData() . |
static Map<String,String> |
getCommitUserData(Directory directory)
Deprecated.
Call
getIndexCommit() on an open IndexReader, and then call
IndexCommit.getUserData() . |
Object |
getCoreCacheKey()
Expert.
|
static long |
getCurrentVersion(Directory directory)
Deprecated.
Use
getVersion() on an opened IndexReader. |
Object |
getDeletesCacheKey()
Expert.
|
abstract FieldInfos |
getFieldInfos()
Get the
FieldInfos describing all fields in
this reader. |
IndexCommit |
getIndexCommit()
Expert: return the IndexCommit that this reader has
opened.
|
int |
getRefCount()
Expert: returns the current refCount for this reader
|
IndexReader[] |
getSequentialSubReaders()
Expert: returns the sequential sub readers that this
reader is logically composed of.
|
abstract TermFreqVector |
getTermFreqVector(int docNumber,
String field)
Return a term frequency vector for the specified document and field.
|
abstract void |
getTermFreqVector(int docNumber,
String field,
TermVectorMapper mapper)
Load the Term Vector into a user-defined data structure instead of relying on the parallel arrays of
the
TermFreqVector . |
abstract void |
getTermFreqVector(int docNumber,
TermVectorMapper mapper)
Map all the term vectors for all fields in a Document
|
abstract TermFreqVector[] |
getTermFreqVectors(int docNumber)
Return an array of term frequency vectors for the specified document.
|
int |
getTermInfosIndexDivisor()
For IndexReader implementations that use
TermInfosReader to read terms, this returns the
current indexDivisor as specified when the reader was
opened.
|
long |
getUniqueTermCount()
Returns the number of unique terms (across all fields)
in this reader.
|
long |
getVersion()
Version number when this IndexReader was opened.
|
abstract boolean |
hasDeletions()
Returns true if any documents have been deleted
|
boolean |
hasNorms(String field)
Returns true if there are norms stored for this field.
|
void |
incRef()
Expert: increments the refCount of this IndexReader
instance.
|
static boolean |
indexExists(Directory directory)
Returns
true if an index exists at the specified directory. |
boolean |
isCurrent()
Check whether any new changes have occurred to the
index since this reader was opened.
|
abstract boolean |
isDeleted(int n)
Returns true if document n has been deleted
|
boolean |
isOptimized()
Deprecated.
Check segment count using
getSequentialSubReaders() instead. |
static long |
lastModified(Directory directory2)
Deprecated.
If you need to track commit time of
an index, you can store it in the commit data (see
IndexWriter.commit(Map)). |
static Collection<IndexCommit> |
listCommits(Directory dir)
Returns all commit points that exist in the Directory.
|
abstract int |
maxDoc()
Returns one greater than the largest possible document number.
|
abstract byte[] |
norms(String field)
Returns the byte-encoded normalization factor for the named field of
every document.
|
abstract void |
norms(String field,
byte[] bytes,
int offset)
Reads the byte-encoded normalization factor for the named field of every
document.
|
int |
numDeletedDocs()
Returns the number of deleted documents.
|
abstract int |
numDocs()
Returns the number of documents in this index.
|
static IndexReader |
open(Directory directory)
Returns an IndexReader reading the index in the given
Directory, with readOnly=true.
|
static IndexReader |
open(Directory directory,
boolean readOnly)
Deprecated.
Write support will be removed in Lucene 4.0.
Use
open(Directory) instead |
static IndexReader |
open(Directory directory,
IndexDeletionPolicy deletionPolicy,
boolean readOnly)
Deprecated.
Write support will be removed in Lucene 4.0.
Use
open(Directory) instead |
static IndexReader |
open(Directory directory,
IndexDeletionPolicy deletionPolicy,
boolean readOnly,
int termInfosIndexDivisor)
Deprecated.
Write support will be removed in Lucene 4.0.
Use
open(Directory,int) instead |
static IndexReader |
open(Directory directory,
int termInfosIndexDivisor)
Expert: Returns an IndexReader reading the index in the given
Directory and given termInfosIndexDivisor.
|
static IndexReader |
open(IndexCommit commit)
Expert: returns an IndexReader reading the index in the given
IndexCommit . |
static IndexReader |
open(IndexCommit commit,
boolean readOnly)
Deprecated.
Write support will be removed in Lucene 4.0.
Use
open(IndexCommit) instead |
static IndexReader |
open(IndexCommit commit,
IndexDeletionPolicy deletionPolicy,
boolean readOnly)
Deprecated.
Write support will be removed in Lucene 4.0.
Use
open(IndexCommit) instead |
static IndexReader |
open(IndexCommit commit,
IndexDeletionPolicy deletionPolicy,
boolean readOnly,
int termInfosIndexDivisor)
Deprecated.
Write support will be removed in Lucene 4.0.
Use
open(IndexCommit,int) instead |
static IndexReader |
open(IndexCommit commit,
int termInfosIndexDivisor)
Expert: returns an IndexReader reading the index in the given
IndexCommit and termInfosIndexDivisor. |
static IndexReader |
open(IndexWriter writer,
boolean applyAllDeletes)
Open a near real time IndexReader from the
IndexWriter . |
static IndexReader |
openIfChanged(IndexReader oldReader)
If the index has changed since the provided reader was
opened, open and return a new reader; else, return
null.
|
static IndexReader |
openIfChanged(IndexReader oldReader,
boolean readOnly)
Deprecated.
Write support will be removed in Lucene 4.0.
Use
openIfChanged(IndexReader) instead |
static IndexReader |
openIfChanged(IndexReader oldReader,
IndexCommit commit)
If the IndexCommit differs from what the
provided reader is searching, or the provided reader is
not already read-only, open and return a new
readOnly=true reader; else, return null. |
static IndexReader |
openIfChanged(IndexReader oldReader,
IndexWriter writer,
boolean applyAllDeletes)
Expert: If there are changes (committed or not) in the
IndexWriter versus what the provided reader is
searching, then open and return a new read-only
IndexReader searching both committed and uncommitted
changes from the writer; else, return null (though, the
current implementation never returns null). |
void |
removeReaderClosedListener(IndexReader.ReaderClosedListener listener)
Expert: remove a previously added
IndexReader.ReaderClosedListener . |
IndexReader |
reopen()
Deprecated.
Use
openIfChanged(IndexReader) instead |
IndexReader |
reopen(boolean openReadOnly)
Deprecated.
Write support will be removed in Lucene 4.0.
Use
openIfChanged(IndexReader) instead |
IndexReader |
reopen(IndexCommit commit)
Deprecated.
Use
openIfChanged(IndexReader,IndexCommit) instead |
IndexReader |
reopen(IndexWriter writer,
boolean applyAllDeletes)
Deprecated.
Use
openIfChanged(IndexReader,IndexWriter,boolean) instead |
void |
setNorm(int doc,
String field,
byte value)
Deprecated.
Write support will be removed in Lucene 4.0.
There will be no replacement for this method.
|
void |
setNorm(int doc,
String field,
float value)
Deprecated.
Write support will be removed in Lucene 4.0.
There will be no replacement for this method.
|
abstract TermDocs |
termDocs()
Returns an unpositioned
TermDocs enumerator. |
TermDocs |
termDocs(Term term)
Returns an enumeration of all the documents which contain
term . |
abstract TermPositions |
termPositions()
Returns an unpositioned
TermPositions enumerator. |
TermPositions |
termPositions(Term term)
Returns an enumeration of all the documents which contain
term . |
abstract TermEnum |
terms()
Returns an enumeration of all the terms in the index.
|
abstract TermEnum |
terms(Term t)
Returns an enumeration of all terms starting at a given term.
|
String |
toString() |
boolean |
tryIncRef()
Expert: increments the refCount of this IndexReader
instance only if the IndexReader has not been closed yet
and returns
true iff the refCount was
successfully incremented, otherwise false . |
void |
undeleteAll()
Deprecated.
Write support will be removed in Lucene 4.0.
There will be no replacement for this method.
|
public final void addReaderClosedListener(IndexReader.ReaderClosedListener listener)
Expert: adds an IndexReader.ReaderClosedListener. The provided listener will be invoked when this reader is closed.
public final void removeReaderClosedListener(IndexReader.ReaderClosedListener listener)
Expert: removes a previously added IndexReader.ReaderClosedListener.
public final int getRefCount()
Expert: returns the current refCount for this reader.
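public final void incRef()
Expert: increments the refCount of this IndexReader instance. RefCounts are used to determine when a reader can be closed safely, i.e. as soon as there are no more references. Be sure to always call a corresponding decRef(), in a finally clause; otherwise the reader may never be closed. Note that close() simply calls decRef(), which means that the IndexReader will not really be closed until decRef() has been called for all outstanding references.
See also: decRef(), tryIncRef()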
public final boolean tryIncRef()
Expert: increments the refCount of this IndexReader instance only if the IndexReader has not been closed yet, and returns true iff the refCount was successfully incremented, otherwise false. If this method returns false, the reader is either already closed or is currently being closed. Either way, this reader instance shouldn't be used by an application unless true is returned.
RefCounts are used to determine when a reader can be closed safely, i.e. as soon as there are no more references. Be sure to always call a corresponding decRef(), in a finally clause; otherwise the reader may never be closed. Note that close() simply calls decRef(), which means that the IndexReader will not really be closed until decRef() has been called for all outstanding references.
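The reference-counting contract described above can be sketched with a small stand-in class. This is a model built only on the JDK, not Lucene's implementation: `RefCountedResource` is a hypothetical name, and the assumption that a freshly opened resource holds exactly one reference is part of the sketch.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-in for a ref-counted reader; NOT Lucene's IndexReader. */
class RefCountedResource {
    // Assumption: a new resource starts with one reference, owned by the opener.
    private final AtomicInteger refCount = new AtomicInteger(1);
    private volatile boolean closed = false;

    int getRefCount() {
        return refCount.get();
    }

    boolean isClosed() {
        return closed;
    }

    /** Succeeds only while at least one reference is still outstanding. */
    boolean tryIncRef() {
        int count;
        while ((count = refCount.get()) > 0) {
            if (refCount.compareAndSet(count, count + 1)) {
                return true;
            }
        }
        return false; // already closed, or being closed right now
    }

    /** Releases one reference; the last release frees the resource. */
    void decRef() {
        int remaining = refCount.decrementAndGet();
        if (remaining == 0) {
            closed = true; // a real reader would release its files here
        } else if (remaining < 0) {
            throw new IllegalStateException("too many decRef calls");
        }
    }

    /** As documented above, close() simply drops one reference. */
    void close() {
        decRef();
    }
}

class RefCountDemo {
    public static void main(String[] args) {
        RefCountedResource reader = new RefCountedResource();
        if (reader.tryIncRef()) {
            try {
                // ... use the reader ...
            } finally {
                reader.decRef(); // always paired with the successful tryIncRef
            }
        }
        reader.close(); // drops the opener's reference
        System.out.println("closed: " + reader.isClosed());
    }
}
```

The point of the try/finally pairing is that every successful increment has exactly one matching decrement, so the resource is released by whichever caller happens to drop the last reference.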
public final void decRef() throws IOException
Expert: decreases the refCount of this IndexReader instance.
IOException - in case an IOException occurs in commit() or doClose()
See also: incRef()
protected final void ensureOpen() throws AlreadyClosedException
AlreadyClosedException - if this IndexReader is closed
public static IndexReader open(Directory directory) throws CorruptIndexException, IOException
Returns an IndexReader reading the index in the given Directory, with readOnly=true.
directory - the index directory
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
@Deprecated public static IndexReader open(Directory directory, boolean readOnly) throws CorruptIndexException, IOException
Deprecated. Write support will be removed in Lucene 4.0. Use open(Directory) instead.
directory - the index directory
readOnly - true if no changes (deletions, norms) will be made with this IndexReader
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public static IndexReader open(IndexWriter writer, boolean applyAllDeletes) throws CorruptIndexException, IOException
Open a near real time IndexReader from the IndexWriter.
writer - The IndexWriter to open from
applyAllDeletes - If true, all buffered deletes will be applied (made visible) in the returned reader. If false, the deletes are not applied but remain buffered (in IndexWriter) so that they will be applied in the future. Applying deletes can be costly, so if your app can tolerate deleted documents being returned you might gain some performance by passing false.
CorruptIndexException
IOException - if there is a low-level IO error
See also: openIfChanged(IndexReader,IndexWriter,boolean)
public static IndexReader open(IndexCommit commit) throws CorruptIndexException, IOException
Expert: returns an IndexReader reading the index in the given IndexCommit.
commit - the commit point to open
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
@Deprecated public static IndexReader open(IndexCommit commit, boolean readOnly) throws CorruptIndexException, IOException
Deprecated. Write support will be removed in Lucene 4.0. Use open(IndexCommit) instead.
Expert: returns an IndexReader reading the index in the given IndexCommit. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader.
commit - the commit point to open
readOnly - true if no changes (deletions, norms) will be made with this IndexReader
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
@Deprecated public static IndexReader open(Directory directory, IndexDeletionPolicy deletionPolicy, boolean readOnly) throws CorruptIndexException, IOException
Deprecated. Write support will be removed in Lucene 4.0. Use open(Directory) instead.
Expert: returns an IndexReader reading the index in the given Directory, with a custom IndexDeletionPolicy. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader.
directory - the index directory
deletionPolicy - a custom deletion policy (only used if you use this reader to perform deletes or to set norms); see IndexWriter for details.
readOnly - true if no changes (deletions, norms) will be made with this IndexReader
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
@Deprecated public static IndexReader open(Directory directory, IndexDeletionPolicy deletionPolicy, boolean readOnly, int termInfosIndexDivisor) throws CorruptIndexException, IOException
Deprecated. Write support will be removed in Lucene 4.0. Use open(Directory,int) instead.
Expert: returns an IndexReader reading the index in the given Directory, with a custom IndexDeletionPolicy. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader.
directory - the index directory
deletionPolicy - a custom deletion policy (only used if you use this reader to perform deletes or to set norms); see IndexWriter for details.
readOnly - true if no changes (deletions, norms) will be made with this IndexReader
termInfosIndexDivisor - Subsamples which indexed terms are loaded into RAM. This has the same effect as IndexWriter.setTermIndexInterval(int) except that setting must be done at indexing time while this setting can be set per reader. When set to N, then one in every N*termIndexInterval terms in the index is loaded into memory. By setting this to a value > 1 you can reduce memory usage, at the expense of higher latency when loading a TermInfo. The default value is 1. Set this to -1 to skip loading the terms index entirely.
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
@Deprecated public static IndexReader open(IndexCommit commit, IndexDeletionPolicy deletionPolicy, boolean readOnly) throws CorruptIndexException, IOException
Deprecated. Write support will be removed in Lucene 4.0. Use open(IndexCommit) instead.
Expert: returns an IndexReader reading the index in the given IndexCommit, with a custom IndexDeletionPolicy. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader.
commit - the specific IndexCommit to open; see listCommits(org.apache.lucene.store.Directory) to list all commits in a directory
deletionPolicy - a custom deletion policy (only used if you use this reader to perform deletes or to set norms); see IndexWriter for details.
readOnly - true if no changes (deletions, norms) will be made with this IndexReader
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
@Deprecated public static IndexReader open(IndexCommit commit, IndexDeletionPolicy deletionPolicy, boolean readOnly, int termInfosIndexDivisor) throws CorruptIndexException, IOException
Deprecated. Write support will be removed in Lucene 4.0. Use open(IndexCommit,int) instead.
Expert: returns an IndexReader reading the index in the given IndexCommit, with a custom IndexDeletionPolicy. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader.
commit - the specific IndexCommit to open; see listCommits(org.apache.lucene.store.Directory) to list all commits in a directory
deletionPolicy - a custom deletion policy (only used if you use this reader to perform deletes or to set norms); see IndexWriter for details.
readOnly - true if no changes (deletions, norms) will be made with this IndexReader
termInfosIndexDivisor - Subsamples which indexed terms are loaded into RAM. This has the same effect as IndexWriter.setTermIndexInterval(int) except that setting must be done at indexing time while this setting can be set per reader. When set to N, then one in every N*termIndexInterval terms in the index is loaded into memory. By setting this to a value > 1 you can reduce memory usage, at the expense of higher latency when loading a TermInfo. The default value is 1. Set this to -1 to skip loading the terms index entirely. This is only useful in advanced situations when you will only .next() through all terms; attempts to seek will hit an exception.
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public static IndexReader open(Directory directory, int termInfosIndexDivisor) throws CorruptIndexException, IOException
Expert: returns an IndexReader reading the index in the given Directory and given termInfosIndexDivisor.
directory - the index directory
termInfosIndexDivisor - Subsamples which indexed terms are loaded into RAM. This has the same effect as IndexWriterConfig.setTermIndexInterval(int) except that setting must be done at indexing time while this setting can be set per reader. When set to N, then one in every N*termIndexInterval terms in the index is loaded into memory. By setting this to a value > 1 you can reduce memory usage, at the expense of higher latency when loading a TermInfo. The default value is 1. Set this to -1 to skip loading the terms index entirely.
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public static IndexReader open(IndexCommit commit, int termInfosIndexDivisor) throws CorruptIndexException, IOException
Expert: returns an IndexReader reading the index in the given IndexCommit and termInfosIndexDivisor.
commit - the commit point to open
termInfosIndexDivisor - Subsamples which indexed terms are loaded into RAM. This has the same effect as IndexWriterConfig.setTermIndexInterval(int) except that setting must be done at indexing time while this setting can be set per reader. When set to N, then one in every N*termIndexInterval terms in the index is loaded into memory. By setting this to a value > 1 you can reduce memory usage, at the expense of higher latency when loading a TermInfo. The default value is 1. Set this to -1 to skip loading the terms index entirely.
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public static IndexReader openIfChanged(IndexReader oldReader) throws IOException
If the index has changed since the provided reader was opened, open and return a new reader; else, return null.
This method is typically far less costly than opening a fully new IndexReader as it shares resources (for example sub-readers) with the provided IndexReader, when possible.
The provided reader is not closed (you are responsible for doing so); if a new reader is returned you also must eventually close it. Be sure to never close a reader while other threads are still using it; see SearcherManager to simplify managing this.
If a new reader is returned, it's safe to make changes (deletions, norms) with it. All shared mutable state with the old reader uses "copy on write" semantics to ensure the changes are not seen by other readers.
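The null-return convention and the close responsibilities described above can be sketched with stand-in classes. This is a JDK-only model, not Lucene code: `Snapshot`, `SnapshotHolder` and `maybeRefresh` are hypothetical names, and the sketch deliberately ignores in-flight users of the old snapshot, which real code would need to handle (for example with reference counting).

```java
/** Stand-in for a point-in-time reader; NOT Lucene's IndexReader. */
class Snapshot {
    final long version;
    private volatile boolean closed;

    Snapshot(long version) {
        this.version = version;
    }

    /** Returns a new snapshot if the source has moved on, else null. */
    static Snapshot openIfChanged(Snapshot old, long currentVersion) {
        return currentVersion != old.version ? new Snapshot(currentVersion) : null;
    }

    void close() {
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

/** Holds the current snapshot and swaps it under a lock. */
class SnapshotHolder {
    private Snapshot current;

    SnapshotHolder(Snapshot initial) {
        this.current = initial;
    }

    synchronized Snapshot get() {
        return current;
    }

    /** Returns true if a newer snapshot was installed. */
    synchronized boolean maybeRefresh(long currentVersion) {
        Snapshot newer = Snapshot.openIfChanged(current, currentVersion);
        if (newer == null) {
            return false;      // unchanged: keep using the old snapshot
        }
        Snapshot old = current;
        current = newer;       // publish the new snapshot first
        old.close();           // the old one is not closed for us
        return true;
    }
}
```

The caller treats null as "nothing to do" rather than as an error, and closes the old instance only after the new one has been published.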
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
@Deprecated public static IndexReader openIfChanged(IndexReader oldReader, boolean readOnly) throws IOException
Deprecated. Write support will be removed in Lucene 4.0. Use openIfChanged(IndexReader) instead.
readOnly; else, return null.
IOException
See also: openIfChanged(IndexReader)
public static IndexReader openIfChanged(IndexReader oldReader, IndexCommit commit) throws IOException
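If the IndexCommit differs from what the provided reader is searching, or the provided reader is not already read-only, open and return a new readOnly=true reader; else, return null.
IOException
See also: openIfChanged(IndexReader)
public static IndexReader openIfChanged(IndexReader oldReader, IndexWriter writer, boolean applyAllDeletes) throws IOException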
Expert: If there are changes (committed or not) in the IndexWriter versus what the provided reader is searching, then open and return a new read-only IndexReader searching both committed and uncommitted changes from the writer; else, return null (though, the current implementation never returns null).
This provides "near real-time" searching, in that changes made during an IndexWriter session can be quickly made available for searching without closing the writer or calling IndexWriter.commit().
It's near real-time because there is no hard guarantee on how quickly you can get a new reader after making changes with IndexWriter. You'll have to experiment in your situation to determine if it's fast enough. As this is a new and experimental feature, please report back on your findings so we can learn, improve and iterate.
The very first time this method is called, this writer instance will make every effort to pool the readers that it opens for doing merges, applying deletes, etc. This means additional resources (RAM, file descriptors, CPU time) will be consumed.
For lower latency on reopening a reader, you should call IndexWriterConfig.setMergedSegmentWarmer(org.apache.lucene.index.IndexWriter.IndexReaderWarmer) to pre-warm a newly merged segment before it's committed to the index. This is important for minimizing index-to-search delay after a large merge.
If an addIndexes* call is running in another thread, then this reader will only search those segments from the foreign index that have been successfully copied over, so far.
NOTE: Once the writer is closed, any outstanding readers may continue to be used. However, if you attempt to reopen any of those readers, you'll hit an AlreadyClosedException.
writer - The IndexWriter to open from
applyAllDeletes - If true, all buffered deletes will be applied (made visible) in the returned reader. If false, the deletes are not applied but remain buffered (in IndexWriter) so that they will be applied in the future. Applying deletes can be costly, so if your app can tolerate deleted documents being returned you might gain some performance by passing false.
IOException
@Deprecated public IndexReader reopen() throws CorruptIndexException, IOException
Deprecated. Use openIfChanged(IndexReader) instead.
Opening an IndexReader is an expensive operation. This method can be used to refresh an existing IndexReader to reduce these costs. This method tries to only load segments that have changed or were created after the IndexReader was (re)opened.
If the index has not changed since this instance was (re)opened, then this call is a NOOP and returns this instance. Otherwise, a new instance is returned. The old instance is not closed and remains usable.
If the reader is reopened, even though they share resources internally, it's safe to make changes (deletions, norms) with the new reader. All shared mutable state obeys "copy on write" semantics to ensure the changes are not seen by other readers.
You can determine whether a reader was actually reopened by comparing the old instance with the instance returned by this method:
    IndexReader reader = ...
    ...
    IndexReader newReader = reader.reopen();
    if (newReader != reader) {
      ... // reader was reopened
      reader.close();
    }
    reader = newReader;
    ...
Be sure to synchronize that code so that other threads, if present, can never use reader after it has been closed and before it's switched to newReader.
NOTE: If this reader is a near real-time reader (obtained from IndexWriter.getReader()), reopen() will simply call writer.getReader() again for you, though this may change in the future.
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
@Deprecated public IndexReader reopen(boolean openReadOnly) throws CorruptIndexException, IOException
Deprecated. Write support will be removed in Lucene 4.0. Use openIfChanged(IndexReader) instead.
Just like reopen(), except you can change the readOnly of the original reader. If the index is unchanged but readOnly is different then a new reader will be returned.
CorruptIndexException
IOException
@Deprecated public IndexReader reopen(IndexCommit commit) throws CorruptIndexException, IOException
Deprecated. Use openIfChanged(IndexReader,IndexCommit) instead.
CorruptIndexException
IOException
@Deprecated public IndexReader reopen(IndexWriter writer, boolean applyAllDeletes) throws CorruptIndexException, IOException
Deprecated. Use openIfChanged(IndexReader,IndexWriter,boolean) instead.
This provides "near real-time" searching, in that changes made during an IndexWriter session can be quickly made available for searching without closing the writer or calling commit().
Note that this is functionally equivalent to calling flush (an internal IndexWriter operation) and then using open(org.apache.lucene.store.Directory) to open a new reader. But the turnaround time of this method should be faster since it avoids the potentially costly commit().
You must close the IndexReader returned by this method once you are done using it.
It's near real-time because there is no hard guarantee on how quickly you can get a new reader after making changes with IndexWriter. You'll have to experiment in your situation to determine if it's fast enough. As this is a new and experimental feature, please report back on your findings so we can learn, improve and iterate.
The resulting reader supports reopen(), but that call will simply forward back to this method (though this may change in the future).
The very first time this method is called, this writer instance will make every effort to pool the readers that it opens for doing merges, applying deletes, etc. This means additional resources (RAM, file descriptors, CPU time) will be consumed.
For lower latency on reopening a reader, you should call IndexWriterConfig.setMergedSegmentWarmer(org.apache.lucene.index.IndexWriter.IndexReaderWarmer) to pre-warm a newly merged segment before it's committed to the index. This is important for minimizing index-to-search delay after a large merge.
If an addIndexes* call is running in another thread, then this reader will only search those segments from the foreign index that have been successfully copied over, so far.
NOTE: Once the writer is closed, any outstanding readers may continue to be used. However, if you attempt to reopen any of those readers, you'll hit an AlreadyClosedException.
writer - The IndexWriter to open from
applyAllDeletes - If true, all buffered deletes will be applied (made visible) in the returned reader. If false, the deletes are not applied but remain buffered (in IndexWriter) so that they will be applied in the future. Applying deletes can be costly, so if your app can tolerate deleted documents being returned you might gain some performance by passing false.
IOException
CorruptIndexException
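protected IndexReader doOpenIfChanged() throws CorruptIndexException, IOException
If the index has changed since it was opened, open and return a new reader; else, return null.
CorruptIndexException
IOException
See also: openIfChanged(IndexReader)
@Deprecated protected IndexReader doOpenIfChanged(boolean openReadOnly) throws CorruptIndexException, IOException
Deprecated. Write support will be removed in Lucene 4.0. Use doOpenIfChanged() instead.
If the index has changed since it was opened, open and return a new reader; else, return null.
CorruptIndexException
IOException
See also: openIfChanged(IndexReader, boolean)
protected IndexReader doOpenIfChanged(IndexCommit commit) throws CorruptIndexException, IOException
If the index has changed since it was opened, open and return a new reader; else, return null.
CorruptIndexException
IOException
See also: openIfChanged(IndexReader, IndexCommit)
protected IndexReader doOpenIfChanged(IndexWriter writer, boolean applyAllDeletes) throws CorruptIndexException, IOException
If the index has changed since it was opened, open and return a new reader; else, return null.
public Object clone()
Efficiently clones the IndexReader (sharing most internal state).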
On cloning a reader with pending changes (deletions, norms), the original reader transfers its write lock to the cloned reader. This means only the cloned reader may make further changes to the index, and commit the changes to the index on close, but the old reader still reflects all changes made up until it was cloned.
Like openIfChanged(IndexReader), it's safe to make changes to either the original or the cloned reader: all shared mutable state obeys "copy on write" semantics to ensure the changes are not seen by other readers.
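As an illustration of the "copy on write" idea mentioned above, here is a minimal sketch in which two reader-like objects share a deletion bit set until one of them writes. It is a JDK-only model, not how Lucene implements it: `CowReader` and `cloneReader` are hypothetical names, and the sketch leaves out write locks and thread safety entirely.

```java
import java.util.BitSet;

/** Toy copy-on-write model of shared deletion state; NOT Lucene code. */
class CowReader {
    private BitSet deleted;   // may be the same object another reader holds
    private boolean shared;   // true while another reader may hold 'deleted'

    CowReader() {
        this(new BitSet(), false);
    }

    private CowReader(BitSet deleted, boolean shared) {
        this.deleted = deleted;
        this.shared = shared;
    }

    /** Cheap clone: both readers point at the same bit set for now. */
    CowReader cloneReader() {
        shared = true;
        return new CowReader(deleted, true);
    }

    /** Copies the shared state before the first write, then writes privately. */
    void delete(int docNum) {
        if (shared) {
            deleted = (BitSet) deleted.clone();
            shared = false;
        }
        deleted.set(docNum);
    }

    boolean isDeleted(int docNum) {
        return deleted.get(docNum);
    }
}
```

Cloning stays cheap because nothing is copied up front; the copy happens only when one side first modifies the state, so neither reader sees the other's later changes.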
@Deprecated public IndexReader clone(boolean openReadOnly) throws CorruptIndexException, IOException
Deprecated. Write support will be removed in Lucene 4.0. Use clone() instead.
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public Directory directory()
Returns the directory associated with this index.
UnsupportedOperationException - if no directory
@Deprecated public static long lastModified(Directory directory2) throws CorruptIndexException, IOException
Deprecated. If you need to track commit time of an index, you can store it in the commit data (see IndexWriter.commit(Map)).
isCurrent() instead.
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
@Deprecated public static long getCurrentVersion(Directory directory) throws CorruptIndexException, IOException
getVersion()
on an opened IndexReader.
directory
- where the index resides.
CorruptIndexException
- if the index is corrupt
IOException
- if there is a low-level IO error
@Deprecated public static Map<String,String> getCommitUserData(Directory directory) throws CorruptIndexException, IOException
IndexWriter.commit(Map)
, from current index
segments file. This will return null if IndexWriter.commit(Map)
has never been called for
this index.
directory
- where the index resides.
CorruptIndexException
- if the index is corrupt
IOException
- if there is a low-level IO error
public long getVersion()
If this reader is based on a Directory (ie, was
created by calling open(org.apache.lucene.store.Directory)
, or openIfChanged(org.apache.lucene.index.IndexReader)
on
a reader based on a Directory), then this method
returns the version recorded in the commit that the
reader opened. This version is advanced every time
IndexWriter.commit()
is called.
UnsupportedOperationException
- unless overridden in subclass
@Deprecated public Map<String,String> getCommitUserData()
IndexWriter.commit(Map)
has never been called for
this index.
public boolean isCurrent() throws CorruptIndexException, IOException
If this reader is based on a Directory (ie, was
created by calling open(org.apache.lucene.store.Directory)
, or openIfChanged(org.apache.lucene.index.IndexReader)
on
a reader based on a Directory), then this method checks
if any further commits (see IndexWriter.commit()
) have occurred in that directory.
If instead this reader is a near real-time reader
(ie, obtained by a call to IndexWriter.getReader()
, or by calling openIfChanged(org.apache.lucene.index.IndexReader)
on a near real-time reader), then this method checks if
either a new commit has occurred, or any new
uncommitted changes have taken place via the writer.
Note that even if the writer has only performed
merging, this method will still return false.
In any event, if this returns false, you should call
openIfChanged(org.apache.lucene.index.IndexReader)
to get a new reader that sees the
changes.
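The check-then-reopen pattern described above can be sketched with a toy model. Everything here is an assumption made for illustration: ToyIndex, ToyReader and refresh are hypothetical stand-ins rather than Lucene classes, and the version counter only imitates the documented behaviour of getVersion(), isCurrent() and openIfChanged(IndexReader), which returns null when nothing has changed.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model only: none of these classes are part of Lucene.
final class RefreshSketch {

    // Stands in for a Directory plus its writer; each commit advances the version.
    static final class ToyIndex {
        private final AtomicLong version = new AtomicLong();

        void commit() { version.incrementAndGet(); }

        ToyReader open() { return new ToyReader(this, version.get()); }
    }

    // Stands in for an IndexReader opened on one particular commit.
    static final class ToyReader {
        private final ToyIndex index;
        private final long openedVersion;
        private boolean closed;

        ToyReader(ToyIndex index, long openedVersion) {
            this.index = index;
            this.openedVersion = openedVersion;
        }

        long getVersion() { return openedVersion; }

        boolean isCurrent() { return index.version.get() == openedVersion; }

        void close() { closed = true; }

        boolean isClosed() { return closed; }

        // Returns a new reader if a commit happened since 'old' was opened, else null.
        static ToyReader openIfChanged(ToyReader old) {
            return old.isCurrent() ? null : old.index.open();
        }
    }

    // Swap in a fresh reader when one is available; otherwise keep the old one.
    static ToyReader refresh(ToyReader reader) {
        ToyReader newer = ToyReader.openIfChanged(reader);
        if (newer == null) {
            return reader;   // nothing changed, keep using the old reader
        }
        reader.close();      // the caller owns the old reader and must release it
        return newer;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        ToyReader reader = index.open();
        index.commit();
        System.out.println("current before refresh: " + reader.isCurrent());
        reader = refresh(reader);
        System.out.println("current after refresh:  " + reader.isCurrent());
    }
}
```

Note that the caller, not the reopen call, closes the old reader in this sketch; in a multi-threaded application the old reader should only be closed once no search is still using it.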
CorruptIndexException
- if the index is corrupt
IOException
- if there is a low-level IO error
UnsupportedOperationException
- unless overridden in subclass
@Deprecated public boolean isOptimized()
getSequentialSubReaders()
instead.
public abstract TermFreqVector[] getTermFreqVectors(int docNumber) throws IOException
TermFreqVector
or of type TermPositionVector
if
positions or offsets have been stored.
docNumber
- document for which term frequency vectors are returned
IOException
- if index cannot be accessed
Field.TermVector
public abstract TermFreqVector getTermFreqVector(int docNumber, String field) throws IOException
TermPositionVector
is returned.
docNumber
- document for which the term frequency vector is returned
field
- field for which the term frequency vector is returned.
IOException
- if index cannot be accessed
Field.TermVector
public abstract void getTermFreqVector(int docNumber, String field, TermVectorMapper mapper) throws IOException
TermFreqVector
.
docNumber
- The number of the document to load the vector for
field
- The name of the field to load
mapper
- The TermVectorMapper
to process the vector. Must not be null
IOException
- if term vectors cannot be accessed or if they do not exist for the specified field and document.
public abstract void getTermFreqVector(int docNumber, TermVectorMapper mapper) throws IOException
docNumber
- The number of the document to load the vector for
mapper
- The TermVectorMapper
to process the vector. Must not be null
IOException
- if term vectors cannot be accessed or if they do not exist for the specified field and document.
public static boolean indexExists(Directory directory) throws IOException
true
if an index exists at the specified directory.
directory
- the directory to check for an index
true
if an index exists; false
otherwise
IOException
- if there is a problem with accessing the index
public abstract int numDocs()
public abstract int maxDoc()
public final int numDeletedDocs()
public final Document document(int n) throws CorruptIndexException, IOException
n
th
Document
in this index.
NOTE: for performance reasons, this method does not check if the
requested document is deleted, and therefore asking for a deleted document
may yield unspecified results. Usually this check is not required; however, you
can call isDeleted(int)
with the requested document ID to verify
the document is not deleted.
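A minimal sketch of the check suggested in the note above, assuming (as in Lucene) that valid document numbers run from 0 to maxDoc() - 1. DocSource is a hypothetical stand-in that exposes only the three methods used here, and its document(int) returns a String rather than a Document so that the example compiles without Lucene.

```java
import java.util.ArrayList;
import java.util.List;

final class LiveDocsSketch {

    // Hypothetical stand-in for the parts of IndexReader used below.
    interface DocSource {
        int maxDoc();
        boolean isDeleted(int n);
        String document(int n);   // the real method returns a Document
    }

    // Collects every document that is not marked as deleted.
    static List<String> liveDocuments(DocSource reader) {
        List<String> live = new ArrayList<String>();
        for (int n = 0; n < reader.maxDoc(); n++) {
            if (reader.isDeleted(n)) {
                continue;   // document(n) is unspecified for deleted documents
            }
            live.add(reader.document(n));
        }
        return live;
    }

    // Small in-memory implementation used only to exercise the loop.
    static DocSource of(final String[] docs, final boolean[] deleted) {
        return new DocSource() {
            public int maxDoc() { return docs.length; }
            public boolean isDeleted(int n) { return deleted[n]; }
            public String document(int n) { return docs[n]; }
        };
    }

    public static void main(String[] args) {
        DocSource reader = of(new String[] {"a", "b", "c"},
                              new boolean[] {false, true, false});
        System.out.println(liveDocuments(reader));
    }
}
```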
CorruptIndexException
- if the index is corrupt
IOException
- if there is a low-level IO error
public abstract Document document(int n, FieldSelector fieldSelector) throws CorruptIndexException, IOException
Document
at the n
th position. The FieldSelector
may be used to determine
what Field
s to load and how they should
be loaded. NOTE: If this Reader (more specifically, the underlying
FieldsReader
) is closed before the lazy
Field
is loaded, an exception may be
thrown. If you want the value of a lazy
Field
to be available after closing, you
must explicitly load it or fetch the Document again with a new loader.
NOTE: for performance reasons, this method does not check if the
requested document is deleted, and therefore asking for a deleted document
may yield unspecified results. Usually this check is not required; however, you
can call isDeleted(int)
with the requested document ID to verify
the document is not deleted.
n
- Get the document at the n
th position
fieldSelector
- The FieldSelector
to use to determine what
Fields should be loaded on the Document. May be null, in which case
all Fields will be loaded.
Document
at the nth position
CorruptIndexException
- if the index is corrupt
IOException
- if there is a low-level IO error
Fieldable
,
FieldSelector
,
SetBasedFieldSelector
,
LoadFirstFieldSelector
public abstract boolean isDeleted(int n)
public abstract boolean hasDeletions()
public boolean hasNorms(String field) throws IOException
IOException
public abstract byte[] norms(String field) throws IOException
IOException
AbstractField.setBoost(float)
public abstract void norms(String field, byte[] bytes, int offset) throws IOException
IOException
AbstractField.setBoost(float)
@Deprecated public final void setNorm(int doc, String field, byte value) throws StaleReaderException, CorruptIndexException, LockObtainFailedException, IOException
boost
and its length normalization
. Thus, to preserve the length normalization
values when resetting this, one should base the new value upon the old.
NOTE: If this field does not index norms, then
this method throws IllegalStateException
.
StaleReaderException
- if the index has changed
since this reader was opened
CorruptIndexException
- if the index is corrupt
LockObtainFailedException
- if another writer
has this index open (write.lock
could not
be obtained)
IOException
- if there is a low-level IO error
IllegalStateException
- if the field does not index norms
norms(String)
,
Similarity.decodeNormValue(byte)
@Deprecated protected abstract void doSetNorm(int doc, String field, byte value) throws CorruptIndexException, IOException
CorruptIndexException
IOException
@Deprecated public final void setNorm(int doc, String field, float value) throws StaleReaderException, CorruptIndexException, LockObtainFailedException, IOException
StaleReaderException
- if the index has changed
since this reader was opened
CorruptIndexException
- if the index is corrupt
LockObtainFailedException
- if another writer
has this index open (write.lock
could not
be obtained)
IOException
- if there is a low-level IO error
norms(String)
,
Similarity.decodeNormValue(byte)
public abstract TermEnum terms() throws IOException
TermEnum.next()
must be called
on the resulting enumeration before calling other methods such as
TermEnum.term()
.
IOException
- if there is a low-level IO error
public abstract TermEnum terms(Term t) throws IOException
IOException
- if there is a low-level IO error
public abstract int docFreq(Term t) throws IOException
t
.
IOException
- if there is a low-level IO error
public TermDocs termDocs(Term term) throws IOException
term
. For each document, the document number and the frequency of
the term in that document are provided, for use in
search scoring. If term is null, then all non-deleted
docs are returned with freq=1.
Thus, this method implements the mapping:
Term => <docNum, freq>*
The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.
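The shape of this enumeration can be illustrated with a toy in-memory inverted index. This is a sketch under stated assumptions, not how Lucene stores postings: PostingsSketch and its methods are hypothetical names, terms are plain strings rather than Term objects, and a TreeMap is used simply because it keeps document numbers in increasing order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy inverted index; not a Lucene class.
final class PostingsSketch {
    // term -> (docNum -> freq); the inner TreeMap keeps docNums ascending.
    private final Map<String, TreeMap<Integer, Integer>> postings =
            new TreeMap<String, TreeMap<Integer, Integer>>();

    // Records one occurrence of each given term in document docNum.
    void add(int docNum, String... terms) {
        for (String term : terms) {
            TreeMap<Integer, Integer> docs = postings.get(term);
            if (docs == null) {
                docs = new TreeMap<Integer, Integer>();
                postings.put(term, docs);
            }
            Integer old = docs.get(docNum);
            docs.put(docNum, old == null ? 1 : old + 1);
        }
    }

    // Returns {docNum, freq} pairs for the term, ordered by increasing docNum.
    List<int[]> termDocs(String term) {
        List<int[]> out = new ArrayList<int[]>();
        TreeMap<Integer, Integer> docs = postings.get(term);
        if (docs != null) {
            for (Map.Entry<Integer, Integer> e : docs.entrySet()) {
                out.add(new int[] {e.getKey(), e.getValue()});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        PostingsSketch index = new PostingsSketch();
        index.add(5, "lucene", "index");
        index.add(2, "lucene", "lucene");
        for (int[] pair : index.termDocs("lucene")) {
            System.out.println("doc=" + pair[0] + " freq=" + pair[1]);
        }
    }
}
```

Because document numbers only increase along the enumeration, two such lists can be intersected in a single forward pass, which is what makes conjunctive queries cheap.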
IOException
- if there is a low-level IO error
public abstract TermDocs termDocs() throws IOException
TermDocs
enumerator.
Note: the TermDocs returned is unpositioned. Before using it, ensure
that you first position it with TermDocs.seek(Term)
or
TermDocs.seek(TermEnum)
.
IOException
- if there is a low-level IO error
public final TermPositions termPositions(Term term) throws IOException
term
. For each document, in addition to the document number
and frequency of the term in that document, a list of all of the ordinal
positions of the term in the document is available. Thus, this method
implements the mapping:
Term => <docNum, freq, <pos_1, pos_2, ..., pos_freq-1>>*
This positional information facilitates phrase and proximity searching.
The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.
IOException
- if there is a low-level IO error
public abstract TermPositions termPositions() throws IOException
TermPositions
enumerator.
IOException
- if there is a low-level IO error
@Deprecated public final void deleteDocument(int docNum) throws StaleReaderException, CorruptIndexException, LockObtainFailedException, IOException
IndexWriter.deleteDocuments(Term)
instead
docNum
. Once a document is
deleted it will not appear in TermDocs or TermPositions enumerations.
Attempts to read its field with the document(int)
method will result in an error. The presence of this document may still be
reflected in the docFreq(org.apache.lucene.index.Term)
statistic, though
this will be corrected eventually as the index is further modified.
StaleReaderException
- if the index has changed
since this reader was opened
CorruptIndexException
- if the index is corrupt
LockObtainFailedException
- if another writer
has this index open (write.lock
could not
be obtained)
IOException
- if there is a low-level IO error
@Deprecated protected abstract void doDelete(int docNum) throws CorruptIndexException, IOException
IndexWriter.deleteDocuments(Term)
instead
docNum
.
Applications should call deleteDocument(int)
or deleteDocuments(Term)
.
CorruptIndexException
IOException
@Deprecated public final int deleteDocuments(Term term) throws StaleReaderException, CorruptIndexException, LockObtainFailedException, IOException
IndexWriter.deleteDocuments(Term)
instead
term
indexed.
This is useful if one uses a document field to hold a unique ID string for
the document. Then to delete such a document, one merely constructs a
term with the appropriate field and the unique ID string as its text and
passes it to this method.
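The unique-ID pattern described above can be sketched with a toy model. All names here are hypothetical: DeleteByIdSketch is not a Lucene class, the "id" field is represented by a plain array indexed by document number, and the method takes the ID text directly instead of a Term. It only illustrates that deletion marks matching documents and reports how many were affected.

```java
import java.util.BitSet;

// Toy model of delete-by-ID; not a Lucene class.
final class DeleteByIdSketch {
    private final String[] ids;    // value of the hypothetical "id" field per document number
    private final BitSet deleted;  // documents marked as deleted

    DeleteByIdSketch(String... ids) {
        this.ids = ids.clone();
        this.deleted = new BitSet(ids.length);
    }

    // Marks every live document whose ID equals idText; returns how many were marked.
    int deleteDocuments(String idText) {
        int count = 0;
        for (int doc = 0; doc < ids.length; doc++) {
            if (!deleted.get(doc) && ids[doc].equals(idText)) {
                deleted.set(doc);
                count++;
            }
        }
        return count;
    }

    boolean isDeleted(int doc) {
        return deleted.get(doc);
    }

    // Number of documents that are not marked as deleted.
    int numDocs() {
        return ids.length - deleted.cardinality();
    }

    public static void main(String[] args) {
        DeleteByIdSketch index = new DeleteByIdSketch("a-1", "b-2", "c-3");
        System.out.println("deleted: " + index.deleteDocuments("b-2"));
        System.out.println("live documents: " + index.numDocs());
    }
}
```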
See deleteDocument(int)
for information about when this deletion will
become effective.
StaleReaderException
- if the index has changed
since this reader was opened
CorruptIndexException
- if the index is corrupt
LockObtainFailedException
- if another writer
has this index open (write.lock
could not
be obtained)
IOException
- if there is a low-level IO error
@Deprecated public final void undeleteAll() throws StaleReaderException, CorruptIndexException, LockObtainFailedException, IOException
NOTE: this method can only recover documents marked
for deletion but not yet removed from the index; when
and how Lucene removes deleted documents is an
implementation detail, subject to change from release
to release. However, you can use numDeletedDocs()
on the current IndexReader instance to
see how many documents will be un-deleted.
StaleReaderException
- if the index has changed
since this reader was opened
LockObtainFailedException
- if another writer
has this index open (write.lock
could not
be obtained)
CorruptIndexException
- if the index is corrupt
IOException
- if there is a low-level IO error
@Deprecated protected abstract void doUndeleteAll() throws CorruptIndexException, IOException
CorruptIndexException
IOException
@Deprecated protected void acquireWriteLock() throws IOException
IOException
@Deprecated public final void flush() throws IOException
IOException
@Deprecated public final void flush(Map<String,String> commitUserData) throws IOException
commitUserData
- Opaque Map (String -> String)
that's recorded into the segments file in the index,
and retrievable by IndexCommit.getUserData()
.
IOException
@Deprecated protected final void commit() throws IOException
IOException
- if there is a low-level IO error
@Deprecated public final void commit(Map<String,String> commitUserData) throws IOException
IOException
- if there is a low-level IO error
@Deprecated protected abstract void doCommit(Map<String,String> commitUserData) throws IOException
IOException
public final void close() throws IOException
close
in interface Closeable
IOException
- if there is a low-level IO error
protected abstract void doClose() throws IOException
IOException
public abstract FieldInfos getFieldInfos()
FieldInfos
describing all fields in
this reader. NOTE: do not make any changes to the
returned FieldInfos!
public IndexCommit getIndexCommit() throws IOException
IOException
public static Collection<IndexCommit> listCommits(Directory dir) throws IOException
KeepOnlyLastCommitDeletionPolicy
, there would be only
one commit point. But if you're using a custom IndexDeletionPolicy
then there could be many commits.
Once you have a given commit, you can open a reader on
it by calling open(IndexCommit,boolean)
. There must be at least one commit in
the Directory, else this method throws IndexNotFoundException
. Note that if a commit is in
progress while this method is running, that commit
may or may not be returned.
IndexCommit
s, from oldest
to latest.
IOException
public IndexReader[] getSequentialSubReaders()
NOTE: You should not try using sub-readers returned by
this method to make any changes (setNorm, deleteDocument,
etc.). While this might succeed for one composite reader
(like MultiReader), it will most likely lead to index
corruption for other readers (like DirectoryReader obtained
through open(org.apache.lucene.store.Directory)
). Use the parent reader directly.
public Object getCoreCacheKey()
public Object getDeletesCacheKey()
public long getUniqueTermCount() throws IOException
UnsupportedOperationException
- if this count
cannot be easily determined (eg Multi*Readers).
Instead, you should call getSequentialSubReaders()
and ask each sub reader for
its unique term count.
IOException
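One plausible reason a composite reader cannot report this count cheaply is that the same term may occur in several sub-readers, so per-reader counts do not simply add up. The sketch below illustrates that with plain sets of strings; UniqueTermsSketch and both of its methods are hypothetical names, and each Set stands in for the term dictionary of one sub-reader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy illustration; not a Lucene class.
final class UniqueTermsSketch {

    // Cheap: add up each segment's own count. Terms shared between segments are counted repeatedly.
    static long sumOfSegmentCounts(List<Set<String>> segments) {
        long total = 0;
        for (Set<String> segment : segments) {
            total += segment.size();
        }
        return total;
    }

    // Exact: requires merging every segment's terms into one set.
    static long exactCount(List<Set<String>> segments) {
        Set<String> all = new HashSet<String>();
        for (Set<String> segment : segments) {
            all.addAll(segment);
        }
        return all.size();
    }

    public static void main(String[] args) {
        List<Set<String>> segments = new ArrayList<Set<String>>();
        segments.add(new HashSet<String>(Arrays.asList("apple", "banana")));
        segments.add(new HashSet<String>(Arrays.asList("banana", "cherry")));
        System.out.println("sum of per-segment counts: " + sumOfSegmentCounts(segments));
        System.out.println("exact unique terms:        " + exactCount(segments));
    }
}
```

The sum of per-segment counts is therefore only an upper bound on the number of unique terms across the whole index.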
public int getTermInfosIndexDivisor()