java.lang.Object org.apache.lucene.index.IndexReader
public abstract class IndexReader
IndexReader is an abstract class, providing an interface for accessing an index. Search of an index is done entirely through this abstract interface, so that any subclass which implements it is searchable.
Concrete subclasses of IndexReader are usually constructed with a call to one of the static open() methods, e.g. open(Directory, boolean).
For efficiency, in this API documents are often referred to via document numbers, non-negative integers which each name a unique document in the index. These document numbers are ephemeral--they may change as documents are added to and deleted from an index. Clients should thus not rely on a given document having the same number between sessions.
An IndexReader can be opened on a directory for which an IndexWriter is already open, but in that case the reader cannot be used to delete documents from the index.
NOTE: for backwards API compatibility, several methods are not listed as abstract, but have no useful implementations in this base class and instead always throw UnsupportedOperationException. Subclasses are strongly encouraged to override these methods, but in many cases may not need to.
NOTE: as of 2.4, it's possible to open a read-only IndexReader using the static open methods that accept the boolean readOnly parameter. Such a reader has better concurrency as it's not necessary to synchronize on the isDeleted method. You must specify false if you want to make changes with the resulting IndexReader.
NOTE: IndexReader instances are completely thread safe, meaning multiple threads can call any of their methods concurrently. If your application requires external synchronization, you should not synchronize on the IndexReader instance; use your own (non-Lucene) objects instead.
Nested Class Summary | |
---|---|
static class |
IndexReader.FieldOption
Constants describing field properties, for example used for getFieldNames(FieldOption). |
static interface |
IndexReader.ReaderFinishedListener
A custom listener that's invoked when the IndexReader is finished. |
Field Summary | |
---|---|
protected boolean |
hasChanges
|
protected Collection<IndexReader.ReaderFinishedListener> |
readerFinishedListeners
|
Constructor Summary | |
---|---|
protected |
IndexReader()
|
Method Summary | |
---|---|
protected void |
acquireWriteLock()
Does nothing by default. |
void |
addReaderFinishedListener(IndexReader.ReaderFinishedListener listener)
Expert: adds an IndexReader.ReaderFinishedListener. |
Object |
clone()
Efficiently clones the IndexReader (sharing most internal state). |
IndexReader |
clone(boolean openReadOnly)
Clones the IndexReader and optionally changes readOnly. |
void |
close()
Closes files associated with this index. |
protected void |
commit()
Commit changes resulting from delete, undeleteAll, or setNorm operations. If an exception is hit, then either no changes or all changes will have been committed to the index (transactional semantics). |
void |
commit(Map<String,String> commitUserData)
Commit changes resulting from delete, undeleteAll, or setNorm operations. If an exception is hit, then either no changes or all changes will have been committed to the index (transactional semantics). |
void |
decRef()
Expert: decreases the refCount of this IndexReader instance. |
void |
deleteDocument(int docNum)
Deletes the document numbered docNum. |
int |
deleteDocuments(Term term)
Deletes all documents that have a given term indexed. |
Directory |
directory()
Returns the directory associated with this index. |
abstract int |
docFreq(Term t)
Returns the number of documents containing the term t. |
protected abstract void |
doClose()
Implements close. |
protected abstract void |
doCommit(Map<String,String> commitUserData)
Implements commit. |
Document |
document(int n)
Returns the stored fields of the nth Document in this index. |
abstract Document |
document(int n,
FieldSelector fieldSelector)
Get the Document at the nth position. |
protected abstract void |
doDelete(int docNum)
Implements deletion of the document numbered docNum. |
protected IndexReader |
doOpenIfChanged()
If the index has changed since it was opened, open and return a new reader; else, return null. |
protected IndexReader |
doOpenIfChanged(boolean openReadOnly)
If the index has changed since it was opened, open and return a new reader; else, return null. |
protected IndexReader |
doOpenIfChanged(IndexCommit commit)
If the index has changed since it was opened, open and return a new reader; else, return null. |
protected IndexReader |
doOpenIfChanged(IndexWriter writer, boolean applyAllDeletes)
If the index has changed since it was opened, open and return a new reader; else, return null. |
protected abstract void |
doSetNorm(int doc,
String field,
byte value)
Implements setNorm in subclass. |
protected abstract void |
doUndeleteAll()
Implements actual undeleteAll() in subclass. |
protected void |
ensureOpen()
|
void |
flush()
|
void |
flush(Map<String,String> commitUserData)
|
Map<String,String> |
getCommitUserData()
Retrieve the String userData optionally passed to IndexWriter#commit. |
static Map<String,String> |
getCommitUserData(Directory directory)
Reads commitUserData, previously passed to IndexWriter.commit(Map), from current index segments file. |
Object |
getCoreCacheKey()
Expert |
static long |
getCurrentVersion(Directory directory)
Reads version number from segments files. |
Object |
getDeletesCacheKey()
Expert. |
abstract Collection<String> |
getFieldNames(IndexReader.FieldOption fldOption)
Get a list of unique field names that exist in this index and have the specified field option information. |
IndexCommit |
getIndexCommit()
Expert: return the IndexCommit that this reader has opened. |
int |
getRefCount()
Expert: returns the current refCount for this reader |
IndexReader[] |
getSequentialSubReaders()
Expert: returns the sequential sub readers that this reader is logically composed of. |
abstract TermFreqVector |
getTermFreqVector(int docNumber,
String field)
Return a term frequency vector for the specified document and field. |
abstract void |
getTermFreqVector(int docNumber,
String field,
TermVectorMapper mapper)
Load the Term Vector into a user-defined data structure instead of relying on the parallel arrays of the TermFreqVector. |
abstract void |
getTermFreqVector(int docNumber, TermVectorMapper mapper)
Map all the term vectors for all fields in a Document. |
abstract TermFreqVector[] |
getTermFreqVectors(int docNumber)
Return an array of term frequency vectors for the specified document. |
int |
getTermInfosIndexDivisor()
For IndexReader implementations that use TermInfosReader to read terms, this returns the current indexDivisor as specified when the reader was opened. |
long |
getUniqueTermCount()
Returns the number of unique terms (across all fields) in this reader. |
long |
getVersion()
Version number when this IndexReader was opened. |
abstract boolean |
hasDeletions()
Returns true if any documents have been deleted |
boolean |
hasNorms(String field)
Returns true if there are norms stored for this field. |
void |
incRef()
Expert: increments the refCount of this IndexReader instance. |
static boolean |
indexExists(Directory directory)
Returns true if an index exists at the specified directory. |
boolean |
isCurrent()
Check whether any new changes have occurred to the index since this reader was opened. |
abstract boolean |
isDeleted(int n)
Returns true if document n has been deleted |
boolean |
isOptimized()
Deprecated. Check segment count using getSequentialSubReaders() instead. |
static long |
lastModified(Directory directory2)
Returns the time the index in the named directory was last modified. |
static Collection<IndexCommit> |
listCommits(Directory dir)
Returns all commit points that exist in the Directory. |
static void |
main(String[] args)
Prints the filename and size of each file within a given compound file. |
abstract int |
maxDoc()
Returns one greater than the largest possible document number. |
abstract byte[] |
norms(String field)
Returns the byte-encoded normalization factor for the named field of every document. |
abstract void |
norms(String field,
byte[] bytes,
int offset)
Reads the byte-encoded normalization factor for the named field of every document. |
protected void |
notifyReaderFinishedListeners()
|
int |
numDeletedDocs()
Returns the number of deleted documents. |
abstract int |
numDocs()
Returns the number of documents in this index. |
static IndexReader |
open(Directory directory)
Returns an IndexReader reading the index in the given Directory, with readOnly=true. |
static IndexReader |
open(Directory directory,
boolean readOnly)
Returns an IndexReader reading the index in the given Directory. |
static IndexReader |
open(Directory directory,
IndexDeletionPolicy deletionPolicy,
boolean readOnly)
Expert: returns an IndexReader reading the index in the given Directory, with a custom IndexDeletionPolicy. |
static IndexReader |
open(Directory directory, IndexDeletionPolicy deletionPolicy, boolean readOnly, int termInfosIndexDivisor)
Expert: returns an IndexReader reading the index in the given Directory, with a custom IndexDeletionPolicy. |
static IndexReader |
open(IndexCommit commit, boolean readOnly)
Expert: returns an IndexReader reading the index in the given IndexCommit. |
static IndexReader |
open(IndexCommit commit, IndexDeletionPolicy deletionPolicy, boolean readOnly)
Expert: returns an IndexReader reading the index in the given Directory, using a specific commit and with a custom IndexDeletionPolicy. |
static IndexReader |
open(IndexCommit commit, IndexDeletionPolicy deletionPolicy, boolean readOnly, int termInfosIndexDivisor)
Expert: returns an IndexReader reading the index in the given Directory, using a specific commit and with a custom IndexDeletionPolicy. |
static IndexReader |
open(IndexWriter writer, boolean applyAllDeletes)
Open a near real time IndexReader from the IndexWriter. |
static IndexReader |
openIfChanged(IndexReader oldReader)
If the index has changed since the provided reader was opened, open and return a new reader; else, return null. |
static IndexReader |
openIfChanged(IndexReader oldReader,
boolean readOnly)
If the index has changed since the provided reader was opened, open and return a new reader, with the specified readOnly; else, return null. |
static IndexReader |
openIfChanged(IndexReader oldReader,
IndexCommit commit)
If the IndexCommit differs from what the provided reader is searching, or the provided reader is not already read-only, open and return a new readOnly=true reader; else, return null. |
static IndexReader |
openIfChanged(IndexReader oldReader,
IndexWriter writer,
boolean applyAllDeletes)
Expert: If there are changes (committed or not) in the IndexWriter versus what the provided reader is searching, then open and return a new read-only IndexReader searching both committed and uncommitted changes from the writer; else, return null (though, the current implementation never returns null). |
protected void |
readerFinished()
|
void |
removeReaderFinishedListener(IndexReader.ReaderFinishedListener listener)
Expert: removes a previously added IndexReader.ReaderFinishedListener. |
IndexReader |
reopen()
Deprecated. Use IndexReader#openIfChanged(IndexReader) instead |
IndexReader |
reopen(boolean openReadOnly)
Deprecated. Use IndexReader#openIfChanged(IndexReader,boolean) instead |
IndexReader |
reopen(IndexCommit commit)
Deprecated. Use IndexReader#openIfChanged(IndexReader,IndexCommit) instead |
IndexReader |
reopen(IndexWriter writer,
boolean applyAllDeletes)
Deprecated. Use IndexReader#openIfChanged(IndexReader,IndexWriter,boolean) instead |
void |
setNorm(int doc,
String field,
byte value)
Expert: Resets the normalization factor for the named field of the named document. |
void |
setNorm(int doc,
String field,
float value)
Deprecated. Use setNorm(int, String, byte) instead, encoding the float to byte with your Similarity's Similarity.encodeNormValue(float). This method will be removed in Lucene 4.0. |
abstract TermDocs |
termDocs()
Returns an unpositioned TermDocs enumerator. |
TermDocs |
termDocs(Term term)
Returns an enumeration of all the documents which contain term. |
abstract TermPositions |
termPositions()
Returns an unpositioned TermPositions enumerator. |
TermPositions |
termPositions(Term term)
Returns an enumeration of all the documents which contain term. |
abstract TermEnum |
terms()
Returns an enumeration of all the terms in the index. |
abstract TermEnum |
terms(Term t)
Returns an enumeration of all terms starting at a given term. |
String |
toString()
|
boolean |
tryIncRef()
Expert: increments the refCount of this IndexReader instance only if the IndexReader has not been closed yet, and returns true iff the refCount was successfully incremented, otherwise false. |
void |
undeleteAll()
Undeletes all documents currently marked as deleted in this index. |
Methods inherited from class java.lang.Object |
---|
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
protected volatile Collection<IndexReader.ReaderFinishedListener> readerFinishedListeners
protected boolean hasChanges
Constructor Detail |
---|
protected IndexReader()
Method Detail |
---|
public void addReaderFinishedListener(IndexReader.ReaderFinishedListener listener)
Expert: adds an IndexReader.ReaderFinishedListener. The provided listener is also added to any sub-readers, if this is a composite reader. Any reader reopened or cloned from this one will also copy the listeners at the time of reopen.
public void removeReaderFinishedListener(IndexReader.ReaderFinishedListener listener)
Expert: removes a previously added IndexReader.ReaderFinishedListener.
protected void notifyReaderFinishedListeners()
protected void readerFinished()
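public int getRefCount()
Expert: returns the current refCount for this reader.
public void incRef()
Expert: increments the refCount of this IndexReader instance. RefCounts are used to determine when a reader can be closed safely, i.e. as soon as there are no more references. Be sure to always call a corresponding decRef() in a finally clause; otherwise the reader may never be closed. Note that close() simply calls decRef(), which means that the IndexReader will not really be closed until decRef() has been called for all outstanding references.
See also: decRef(), tryIncRef()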
public boolean tryIncRef()
Expert: increments the refCount of this IndexReader instance only if the IndexReader has not been closed yet, and returns true iff the refCount was successfully incremented, otherwise false. If this method returns false, the reader is either already closed or is currently being closed. Either way, this reader instance shouldn't be used by an application unless true is returned.
RefCounts are used to determine when a reader can be closed safely, i.e. as soon as there are no more references. Be sure to always call a corresponding decRef() in a finally clause; otherwise the reader may never be closed. Note that close() simply calls decRef(), which means that the IndexReader will not really be closed until decRef() has been called for all outstanding references.
See also: decRef(), incRef()
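The reference-counting contract described for incRef(), tryIncRef() and decRef() can be illustrated with a small stand-alone sketch. The class below is hypothetical and is not Lucene code: RefCounted, isClosed() and useSafely() are names invented for illustration, and the real IndexReader additionally commits pending changes and may throw IOException when it closes. It only shows the pairing of a successful tryIncRef() with a decRef() in a finally clause.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in that mimics the incRef/tryIncRef/decRef contract; not Lucene code. */
final class RefCounted {
    // A new instance starts with one reference, owned by whoever opened it.
    private final AtomicInteger refCount = new AtomicInteger(1);
    private volatile boolean closed = false;

    int getRefCount() { return refCount.get(); }

    boolean isClosed() { return closed; }

    /** Unconditionally adds a reference; the caller must already hold one. */
    void incRef() { refCount.incrementAndGet(); }

    /** Adds a reference only while the count is still above zero. */
    boolean tryIncRef() {
        int count;
        while ((count = refCount.get()) > 0) {
            if (refCount.compareAndSet(count, count + 1)) {
                return true;
            }
        }
        return false;
    }

    /** Releases one reference; the resource is really closed when the count reaches zero. */
    void decRef() {
        int remaining = refCount.decrementAndGet();
        if (remaining == 0) {
            closed = true;
        } else if (remaining < 0) {
            throw new IllegalStateException("too many decRef calls");
        }
    }

    /** Mirrors the note above: close() simply gives up one reference. */
    void close() { decRef(); }

    /** Usage pattern: pair every successful tryIncRef with a decRef in a finally clause. */
    static boolean useSafely(RefCounted resource, Runnable work) {
        if (!resource.tryIncRef()) {
            return false; // already closed or being closed: do not use it
        }
        try {
            work.run();
            return true;
        } finally {
            resource.decRef();
        }
    }
}
```

In this sketch a call to close() while another reference is outstanding leaves the resource open; it is only marked closed once the last holder calls decRef().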
public String toString()
Overrides: toString in class Object
public void decRef() throws IOException
Expert: decreases the refCount of this IndexReader instance.
IOException - in case an IOException occurs in commit() or doClose()
See also: incRef()
protected final void ensureOpen() throws AlreadyClosedException
AlreadyClosedException - if this IndexReader is closed
public static IndexReader open(Directory directory) throws CorruptIndexException, IOException
Returns an IndexReader reading the index in the given Directory, with readOnly=true.
directory - the index directory
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public static IndexReader open(Directory directory, boolean readOnly) throws CorruptIndexException, IOException
Returns an IndexReader reading the index in the given Directory.
directory - the index directory
readOnly - true if no changes (deletions, norms) will be made with this IndexReader
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public static IndexReader open(IndexWriter writer, boolean applyAllDeletes) throws CorruptIndexException, IOException
Open a near real time IndexReader from the IndexWriter.
writer - The IndexWriter to open from
applyAllDeletes - If true, all buffered deletes will be applied (made visible) in the returned reader. If false, the deletes are not applied but remain buffered (in IndexWriter) so that they will be applied in the future. Applying deletes can be costly, so if your app can tolerate deleted documents being returned you might gain some performance by passing false.
CorruptIndexException
IOException - if there is a low-level IO error
See also: openIfChanged(IndexReader,IndexWriter,boolean)
public static IndexReader open(IndexCommit commit, boolean readOnly) throws CorruptIndexException, IOException
Expert: returns an IndexReader reading the index in the given IndexCommit. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader.
commit - the commit point to open
readOnly - true if no changes (deletions, norms) will be made with this IndexReader
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public static IndexReader open(Directory directory, IndexDeletionPolicy deletionPolicy, boolean readOnly) throws CorruptIndexException, IOException
Expert: returns an IndexReader reading the index in the given Directory, with a custom IndexDeletionPolicy. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader.
directory - the index directory
deletionPolicy - a custom deletion policy (only used if you use this reader to perform deletes or to set norms); see IndexWriter for details.
readOnly - true if no changes (deletions, norms) will be made with this IndexReader
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public static IndexReader open(Directory directory, IndexDeletionPolicy deletionPolicy, boolean readOnly, int termInfosIndexDivisor) throws CorruptIndexException, IOException
Expert: returns an IndexReader reading the index in the given Directory, with a custom IndexDeletionPolicy. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader.
directory - the index directory
deletionPolicy - a custom deletion policy (only used if you use this reader to perform deletes or to set norms); see IndexWriter for details.
readOnly - true if no changes (deletions, norms) will be made with this IndexReader
termInfosIndexDivisor - Subsamples which indexed terms are loaded into RAM. This has the same effect as IndexWriter.setTermIndexInterval(int) except that setting must be done at indexing time while this setting can be set per reader. When set to N, then one in every N*termIndexInterval terms in the index is loaded into memory. By setting this to a value > 1 you can reduce memory usage, at the expense of higher latency when loading a TermInfo. The default value is 1. Set this to -1 to skip loading the terms index entirely.
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public static IndexReader open(IndexCommit commit, IndexDeletionPolicy deletionPolicy, boolean readOnly) throws CorruptIndexException, IOException
Expert: returns an IndexReader reading the index in the given Directory, using a specific commit and with a custom IndexDeletionPolicy. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader.
commit - the specific IndexCommit to open; see listCommits(org.apache.lucene.store.Directory) to list all commits in a directory
deletionPolicy - a custom deletion policy (only used if you use this reader to perform deletes or to set norms); see IndexWriter for details.
readOnly - true if no changes (deletions, norms) will be made with this IndexReader
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public static IndexReader open(IndexCommit commit, IndexDeletionPolicy deletionPolicy, boolean readOnly, int termInfosIndexDivisor) throws CorruptIndexException, IOException
Expert: returns an IndexReader reading the index in the given Directory, using a specific commit and with a custom IndexDeletionPolicy. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader.
commit - the specific IndexCommit to open; see listCommits(org.apache.lucene.store.Directory) to list all commits in a directory
deletionPolicy - a custom deletion policy (only used if you use this reader to perform deletes or to set norms); see IndexWriter for details.
readOnly - true if no changes (deletions, norms) will be made with this IndexReader
termInfosIndexDivisor - Subsamples which indexed terms are loaded into RAM. This has the same effect as IndexWriter.setTermIndexInterval(int) except that setting must be done at indexing time while this setting can be set per reader. When set to N, then one in every N*termIndexInterval terms in the index is loaded into memory. By setting this to a value > 1 you can reduce memory usage, at the expense of higher latency when loading a TermInfo. The default value is 1. Set this to -1 to skip loading the terms index entirely. This is only useful in advanced situations when you will only .next() through all terms; attempts to seek will hit an exception.
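To make the "one in every N*termIndexInterval terms" rule concrete, here is a rough back-of-the-envelope sketch. It is not part of Lucene: TermIndexEstimate and loadedTerms are hypothetical names, the ceiling division is an assumed approximation of how many index entries result, and the interval of 128 used in the usage note is only an example value.

```java
/** Hypothetical helper for estimating terms-index size; not part of Lucene. */
final class TermIndexEstimate {
    private TermIndexEstimate() {}

    /**
     * Approximate number of terms held in RAM when one in every
     * (divisor * termIndexInterval) terms of the index is loaded.
     */
    static long loadedTerms(long totalTerms, int termIndexInterval, int divisor) {
        if (divisor == -1) {
            return 0; // the terms index is not loaded at all
        }
        if (divisor < 1 || termIndexInterval < 1 || totalTerms < 0) {
            throw new IllegalArgumentException("expected divisor >= 1 (or -1), interval >= 1, totalTerms >= 0");
        }
        long step = (long) divisor * termIndexInterval;
        return (totalTerms + step - 1) / step; // ceiling division
    }
}
```

For example, with 10,000,000 terms and an assumed interval of 128, a divisor of 1 gives about 78,125 loaded terms, while a divisor of 4 gives about 19,532, roughly a quarter of the memory at the cost of longer scans per lookup.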
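CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public static IndexReader openIfChanged(IndexReader oldReader) throws IOException
If the index has changed since the provided reader was opened, open and return a new reader; else, return null.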
This method is typically far less costly than opening a fully new IndexReader as it shares resources (for example sub-readers) with the provided IndexReader, when possible.
The provided reader is not closed (you are responsible for doing so); if a new reader is returned you also must eventually close it. Be sure to never close a reader while other threads are still using it; see SearcherManager in contrib/misc to simplify managing this.
If a new reader is returned, it's safe to make changes (deletions, norms) with it. All shared mutable state with the old reader uses "copy on write" semantics to ensure the changes are not seen by other readers.
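The "returns null if unchanged, caller closes the old reader" contract can be sketched with a small generic holder. This is an illustrative sketch only: ReaderHolder, Reopener and maybeRefresh are hypothetical names, not Lucene classes, and the sketch assumes no other thread is still using the old reader when it is closed (in concurrent code, reference counting or SearcherManager would be needed). With Lucene, the callback would presumably wrap a call to IndexReader.openIfChanged(oldReader).

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical holder illustrating the openIfChanged contract; not part of Lucene. */
final class ReaderHolder<R extends Closeable> {

    /** Hypothetical callback: returns a new reader, or null if nothing changed. */
    interface Reopener<T> {
        T openIfChanged(T oldReader) throws IOException;
    }

    private R current;

    ReaderHolder(R initial) {
        this.current = initial;
    }

    synchronized R get() {
        return current;
    }

    /** Swaps in a new reader if one is returned; reports whether a swap happened. */
    synchronized boolean maybeRefresh(Reopener<R> reopener) {
        try {
            R newReader = reopener.openIfChanged(current);
            if (newReader == null) {
                return false; // index unchanged: keep using the current reader
            }
            R old = current;
            current = newReader;
            old.close(); // the old reader is not closed for us, so close it here
            return true;
        } catch (IOException e) {
            // Wrapped only to keep the sketch short; real code may prefer to rethrow.
            throw new UncheckedIOException(e);
        }
    }
}
```

The null check is the important part: treating a null return as "no change" avoids both reopening needlessly and closing a reader that is still current.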
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public static IndexReader openIfChanged(IndexReader oldReader, boolean readOnly) throws IOException
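If the index has changed since the provided reader was opened, open and return a new reader, with the specified readOnly; else, return null.
IOException
See also: openIfChanged(IndexReader)
public static IndexReader openIfChanged(IndexReader oldReader, IndexCommit commit) throws IOException
If the IndexCommit differs from what the provided reader is searching, or the provided reader is not already read-only, open and return a new readOnly=true reader; else, return null.
IOException
See also: openIfChanged(IndexReader)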
public static IndexReader openIfChanged(IndexReader oldReader, IndexWriter writer, boolean applyAllDeletes) throws IOException
Expert: If there are changes (committed or not) in the IndexWriter versus what the provided reader is searching, then open and return a new read-only IndexReader searching both committed and uncommitted changes from the writer; else, return null (though, the current implementation never returns null).
This provides "near real-time" searching, in that changes made during an IndexWriter session can be quickly made available for searching without closing the writer or calling commit().
It's near real-time because there is no hard guarantee on how quickly you can get a new reader after making changes with IndexWriter. You'll have to experiment in your situation to determine if it's fast enough. As this is a new and experimental feature, please report back on your findings so we can learn, improve and iterate.
The very first time this method is called, this writer instance will make every effort to pool the readers that it opens for doing merges, applying deletes, etc. This means additional resources (RAM, file descriptors, CPU time) will be consumed.
For lower latency on reopening a reader, you should call IndexWriterConfig.setMergedSegmentWarmer(org.apache.lucene.index.IndexWriter.IndexReaderWarmer) to pre-warm a newly merged segment before it's committed to the index. This is important for minimizing index-to-search delay after a large merge.
If an addIndexes* call is running in another thread, then this reader will only search those segments from the foreign index that have been successfully copied over, so far.
NOTE: Once the writer is closed, any outstanding readers may continue to be used. However, if you attempt to reopen any of those readers, you'll hit an AlreadyClosedException.
writer - The IndexWriter to open from
applyAllDeletes - If true, all buffered deletes will be applied (made visible) in the returned reader. If false, the deletes are not applied but remain buffered (in IndexWriter) so that they will be applied in the future. Applying deletes can be costly, so if your app can tolerate deleted documents being returned you might gain some performance by passing false.
IOException
@Deprecated public IndexReader reopen() throws CorruptIndexException, IOException
Deprecated. Use openIfChanged(IndexReader) instead.
Opening an IndexReader is an expensive operation. This method can be used to refresh an existing IndexReader to reduce these costs. This method tries to only load segments that have changed or were created after the IndexReader was (re)opened.
If the index has not changed since this instance was (re)opened, then this call is a NOOP and returns this instance. Otherwise, a new instance is returned. The old instance is not closed and remains usable.
If the reader is reopened, even though they share resources internally, it's safe to make changes (deletions, norms) with the new reader. All shared mutable state obeys "copy on write" semantics to ensure the changes are not seen by other readers.
You can determine whether a reader was actually reopened by comparing the old instance with the instance returned by this method:
    IndexReader reader = ...
    ...
    IndexReader newReader = reader.reopen();
    if (newReader != reader) {
      ... // reader was reopened
      reader.close();
    }
    reader = newReader;
    ...
Be sure to synchronize that code so that other threads, if present, can never use reader after it has been closed and before it's switched to newReader.
NOTE: If this reader is a near real-time reader (obtained from IndexWriter.getReader()), reopen() will simply call writer.getReader() again for you, though this may change in the future.
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
@Deprecated public IndexReader reopen(boolean openReadOnly) throws CorruptIndexException, IOException
Deprecated. Use openIfChanged(IndexReader, boolean) instead.
Just like reopen(), except you can change the readOnly of the original reader. If the index is unchanged but readOnly is different then a new reader will be returned.
CorruptIndexException
IOException
@Deprecated public IndexReader reopen(IndexCommit commit) throws CorruptIndexException, IOException
Deprecated. Use openIfChanged(IndexReader, IndexCommit) instead.
CorruptIndexException
IOException
@Deprecated public IndexReader reopen(IndexWriter writer, boolean applyAllDeletes) throws CorruptIndexException, IOException
Deprecated. Use openIfChanged(IndexReader, IndexWriter, boolean) instead.
commit()
.
Note that this is functionally equivalent to calling flush (an internal IndexWriter operation) and then using open(org.apache.lucene.store.Directory) to open a new reader. But the turnaround time of this method should be faster since it avoids the potentially costly commit().
You must close the IndexReader returned by this method once you are done using it.
It's near real-time because there is no hard guarantee on how quickly you can get a new reader after making changes with IndexWriter. You'll have to experiment in your situation to determine if it's fast enough. As this is a new and experimental feature, please report back on your findings so we can learn, improve and iterate.
The resulting reader supports reopen(), but that call will simply forward back to this method (though this may change in the future).
The very first time this method is called, this writer instance will make every effort to pool the readers that it opens for doing merges, applying deletes, etc. This means additional resources (RAM, file descriptors, CPU time) will be consumed.
For lower latency on reopening a reader, you should call IndexWriterConfig.setMergedSegmentWarmer(org.apache.lucene.index.IndexWriter.IndexReaderWarmer) to pre-warm a newly merged segment before it's committed to the index. This is important for minimizing index-to-search delay after a large merge.
If an addIndexes* call is running in another thread, then this reader will only search those segments from the foreign index that have been successfully copied over, so far.
NOTE: Once the writer is closed, any outstanding readers may continue to be used. However, if you attempt to reopen any of those readers, you'll hit an AlreadyClosedException.
writer - The IndexWriter to open from
applyAllDeletes - If true, all buffered deletes will be applied (made visible) in the returned reader. If false, the deletes are not applied but remain buffered (in IndexWriter) so that they will be applied in the future. Applying deletes can be costly, so if your app can tolerate deleted documents being returned you might gain some performance by passing false.
IOException
CorruptIndexException
protected IndexReader doOpenIfChanged() throws CorruptIndexException, IOException
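If the index has changed since it was opened, open and return a new reader; else, return null.
CorruptIndexException
IOException
See also: openIfChanged(IndexReader)
protected IndexReader doOpenIfChanged(boolean openReadOnly) throws CorruptIndexException, IOException
If the index has changed since it was opened, open and return a new reader; else, return null.
CorruptIndexException
IOException
See also: openIfChanged(IndexReader, boolean)
protected IndexReader doOpenIfChanged(IndexCommit commit) throws CorruptIndexException, IOException
If the index has changed since it was opened, open and return a new reader; else, return null.
CorruptIndexException
IOException
See also: openIfChanged(IndexReader, IndexCommit)
protected IndexReader doOpenIfChanged(IndexWriter writer, boolean applyAllDeletes) throws CorruptIndexException, IOException
If the index has changed since it was opened, open and return a new reader; else, return null.
CorruptIndexException
IOException
See also: openIfChanged(IndexReader, IndexWriter, boolean)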
public Object clone()
Efficiently clones the IndexReader (sharing most internal state).
On cloning a reader with pending changes (deletions, norms), the original reader transfers its write lock to the cloned reader. This means only the cloned reader may make further changes to the index, and commit the changes to the index on close, but the old reader still reflects all changes made up until it was cloned.
Like openIfChanged(IndexReader), it's safe to make changes to either the original or the cloned reader: all shared mutable state obeys "copy on write" semantics to ensure the changes are not seen by other readers.
Overrides: clone in class Object
public IndexReader clone(boolean openReadOnly) throws CorruptIndexException, IOException
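Clones the IndexReader and optionally changes readOnly.
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public Directory directory()
Returns the directory associated with this index.
UnsupportedOperationException - if no directory
public static long lastModified(Directory directory2) throws CorruptIndexException, IOException
Returns the time the index in the named directory was last modified. Do not use this to check whether the reader is still up-to-date; use isCurrent() instead.
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public static long getCurrentVersion(Directory directory) throws CorruptIndexException, IOException
Reads version number from segments files.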
directory - where the index resides.
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
public static Map<String,String> getCommitUserData(Directory directory) throws CorruptIndexException, IOException
Reads commitUserData, previously passed to IndexWriter.commit(Map), from current index segments file. This will return null if IndexWriter.commit(Map) has never been called for this index.
directory - where the index resides.
CorruptIndexException - if the index is corrupt
IOException - if there is a low-level IO error
See also: getCommitUserData()
public long getVersion()
If this reader is based on a Directory (ie, was
created by calling open(org.apache.lucene.store.Directory)
, or openIfChanged(org.apache.lucene.index.IndexReader)
on
a reader based on a Directory), then this method
returns the version recorded in the commit that the
reader opened. This version is advanced every time
IndexWriter.commit()
is called.
If instead this reader is a near real-time reader
(ie, obtained by a call to IndexWriter.getReader()
, or by calling openIfChanged(org.apache.lucene.index.IndexReader)
on a near real-time reader), then this method returns
the version of the last commit done by the writer.
Note that even as further changes are made with the
writer, the version will not changed until a commit is
completed. Thus, you should not rely on this method to
determine when a near real-time reader should be
opened. Use isCurrent()
instead.
UnsupportedOperationException
- unless overridden in subclass
public Map<String,String> getCommitUserData()
IndexWriter.commit(Map)
has never been called for
this index.
getCommitUserData(Directory)
public boolean isCurrent() throws CorruptIndexException, IOException
If this reader is based on a Directory (ie, was
created by calling open(org.apache.lucene.store.Directory)
, or openIfChanged(org.apache.lucene.index.IndexReader)
on
a reader based on a Directory), then this method checks
if any further commits (see IndexWriter.commit()
) have occurred in that directory.
If instead this reader is a near real-time reader
(ie, obtained by a call to IndexWriter.getReader()
, or by calling openIfChanged(org.apache.lucene.index.IndexReader)
on a near real-time reader), then this method checks if
either a new commit has occurred, or any new
uncommitted changes have taken place via the writer.
Note that even if the writer has only performed
merging, this method will still return false.
In any event, if this returns false, you should call
openIfChanged(org.apache.lucene.index.IndexReader)
to get a new reader that sees the
changes.
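The version-based check described above can be illustrated with a small self-contained model. This is a sketch only and does not use Lucene: `ToyStore` and `ToyVersionedReader` are hypothetical names, the store's version is assumed to advance by one on every commit, and the static `openIfChanged` here is assumed to return null when nothing has changed. Near real-time readers and uncommitted changes are not modelled.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of a reader that remembers the commit version it was opened on; NOT Lucene code. */
public class ToyVersionedReader {

    /** Hypothetical stand-in for a directory whose version advances on every commit. */
    public static final class ToyStore {
        private final AtomicLong version = new AtomicLong();

        /** Records a commit and returns the new version. */
        public long commit() { return version.incrementAndGet(); }

        public long currentVersion() { return version.get(); }
    }

    private final ToyStore store;
    private final long openedVersion; // version of the commit this reader was opened on

    public ToyVersionedReader(ToyStore store) {
        this.store = store;
        this.openedVersion = store.currentVersion();
    }

    public long getVersion() { return openedVersion; }

    /** True while no further commit has happened since this reader was opened. */
    public boolean isCurrent() { return openedVersion == store.currentVersion(); }

    /** Returns a fresh reader if the old one is stale, or null if it is still current. */
    public static ToyVersionedReader openIfChanged(ToyVersionedReader old) {
        return old.isCurrent() ? null : new ToyVersionedReader(old.store);
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        ToyVersionedReader reader = new ToyVersionedReader(store);
        store.commit(); // the reader is now stale
        ToyVersionedReader newer = openIfChanged(reader);
        if (newer != null) {
            reader = newer; // a real application would also close the old reader
        }
        System.out.println("current=" + reader.isCurrent() + " version=" + reader.getVersion());
    }
}
```

In this model a caller checks for a non-null result and swaps readers, which is the same shape as the "call openIfChanged when isCurrent() returns false" advice above.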
CorruptIndexException
- if the index is corrupt
IOException
- if there is a low-level IO error
UnsupportedOperationException
- unless overridden in subclass
@Deprecated public boolean isOptimized()
getSequentialSubReaders()
instead.
public abstract TermFreqVector[] getTermFreqVectors(int docNumber) throws IOException
TermFreqVector
or of type TermPositionVector
if
positions or offsets have been stored.
docNumber
- document for which term frequency vectors are returned
IOException
- if index cannot be accessed
Field.TermVector
public abstract TermFreqVector getTermFreqVector(int docNumber, String field) throws IOException
TermPositionVector
is returned.
docNumber
- document for which the term frequency vector is returned
field
- field for which the term frequency vector is returned.
IOException
- if index cannot be accessed
Field.TermVector
public abstract void getTermFreqVector(int docNumber, String field, TermVectorMapper mapper) throws IOException
TermFreqVector
.
docNumber
- The number of the document to load the vector for
field
- The name of the field to load
mapper
- The TermVectorMapper
to process the vector. Must not be null
IOException
- if term vectors cannot be accessed or if they do not exist for the specified field and document.
public abstract void getTermFreqVector(int docNumber, TermVectorMapper mapper) throws IOException
docNumber
- The number of the document to load the vector for
mapper
- The TermVectorMapper
to process the vector. Must not be null
IOException
- if term vectors cannot be accessed or if they do not exist for the specified document.
public static boolean indexExists(Directory directory) throws IOException
true
if an index exists at the specified directory.
directory
- the directory to check for an index
true
if an index exists; false
otherwise
IOException
- if there is a problem with accessing the index
public abstract int numDocs()
public abstract int maxDoc()
public int numDeletedDocs()
public Document document(int n) throws CorruptIndexException, IOException
n
th
Document
in this index.
NOTE: for performance reasons, this method does not check if the
requested document is deleted, and therefore asking for a deleted document
may yield unspecified results. Usually this check is not required; however, you
can call isDeleted(int)
with the requested document ID to verify
the document is not deleted.
CorruptIndexException
- if the index is corrupt
IOException
- if there is a low-level IO error
public abstract Document document(int n, FieldSelector fieldSelector) throws CorruptIndexException, IOException
Document
at the n
th position. The FieldSelector
may be used to determine
what Field
s to load and how they should
be loaded. NOTE: If this Reader (more specifically, the underlying
FieldsReader
) is closed before the lazy
Field
is loaded an exception may be
thrown. If you want the value of a lazy
Field
to be available after closing you
must explicitly load it or fetch the Document again with a new loader.
NOTE: for performance reasons, this method does not check if the
requested document is deleted, and therefore asking for a deleted document
may yield unspecified results. Usually this check is not required; however, you
can call isDeleted(int)
with the requested document ID to verify
the document is not deleted.
n
- Get the document at the n
th position
fieldSelector
- The FieldSelector
to use to determine what
Fields should be loaded on the Document. May be null, in which case
all Fields will be loaded.
Document
at the nth position
CorruptIndexException
- if the index is corrupt
IOException
- if there is a low-level IO error
Fieldable
,
FieldSelector
,
SetBasedFieldSelector
,
LoadFirstFieldSelector
public abstract boolean isDeleted(int n)
public abstract boolean hasDeletions()
public boolean hasNorms(String field) throws IOException
IOException
public abstract byte[] norms(String field) throws IOException
IOException
AbstractField.setBoost(float)
public abstract void norms(String field, byte[] bytes, int offset) throws IOException
IOException
AbstractField.setBoost(float)
public void setNorm(int doc, String field, byte value) throws StaleReaderException, CorruptIndexException, LockObtainFailedException, IOException
boost
and its length normalization
. Thus, to preserve the length normalization
values when resetting this, one should base the new value upon the old.
NOTE: If this field does not index norms, then
this method throws IllegalStateException
.
StaleReaderException
- if the index has changed
since this reader was opened
CorruptIndexException
- if the index is corrupt
LockObtainFailedException
- if another writer
has this index open (write.lock
could not
be obtained)
IOException
- if there is a low-level IO error
IllegalStateException
- if the field does not index norms
norms(String)
,
Similarity.decodeNormValue(byte)
protected abstract void doSetNorm(int doc, String field, byte value) throws CorruptIndexException, IOException
CorruptIndexException
IOException
@Deprecated public void setNorm(int doc, String field, float value) throws StaleReaderException, CorruptIndexException, LockObtainFailedException, IOException
setNorm(int, String, byte)
instead, encoding the
float to byte with your Similarity's Similarity.encodeNormValue(float)
.
This method will be removed in Lucene 4.0
StaleReaderException
- if the index has changed
since this reader was opened
CorruptIndexException
- if the index is corrupt
LockObtainFailedException
- if another writer
has this index open (write.lock
could not
be obtained)
IOException
- if there is a low-level IO error
norms(String)
,
Similarity.decodeNormValue(byte)
public abstract TermEnum terms() throws IOException
TermEnum.next()
must be called
on the resulting enumeration before calling other methods such as
TermEnum.term()
.
IOException
- if there is a low-level IO error
public abstract TermEnum terms(Term t) throws IOException
IOException
- if there is a low-level IO error
public abstract int docFreq(Term t) throws IOException
t
.
IOException
- if there is a low-level IO error
public TermDocs termDocs(Term term) throws IOException
term
. For each document, the document number and the frequency of
the term in that document are provided, for use in
search scoring. If term is null, then all non-deleted
docs are returned with freq=1.
Thus, this method implements the mapping:
Term => <docNum, freq>*
The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.
IOException
- if there is a low-level IO error
public abstract TermDocs termDocs() throws IOException
TermDocs
enumerator.
Note: the TermDocs returned is unpositioned. Before using it, ensure
that you first position it with TermDocs.seek(Term)
or
TermDocs.seek(TermEnum)
.
IOException
- if there is a low-level IO error
public TermPositions termPositions(Term term) throws IOException
term
. For each document, in addition to the document number
and frequency of the term in that document, a list of all of the ordinal
positions of the term in the document is available. Thus, this method
implements the mapping:
Term => <docNum, freq, <pos1, pos2, ... posfreq-1>>*
This positional information facilitates phrase and proximity searching.
The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.
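To make the shape of this mapping concrete, here is a minimal self-contained sketch of a positional postings structure. It is a toy model, not Lucene code: the class name `ToyPositionalIndex`, whitespace tokenization, and the use of plain strings as terms are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy positional index illustrating term => (docNum, freq, positions)*; NOT Lucene code. */
public class ToyPositionalIndex {
    // term text -> (docNum -> ordinal positions); the inner TreeMap keeps docNums ascending
    private final Map<String, TreeMap<Integer, List<Integer>>> postings = new TreeMap<>();
    private int nextDocNum = 0;

    /** Adds a document of whitespace-separated terms and returns its document number. */
    public int addDocument(String text) {
        int docNum = nextDocNum++;
        String[] tokens = text.trim().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            postings.computeIfAbsent(tokens[pos], t -> new TreeMap<>())
                    .computeIfAbsent(docNum, d -> new ArrayList<>())
                    .add(pos);
        }
        return docNum;
    }

    /** Returns docNum -> positions for the term; the frequency is the size of each list. */
    public TreeMap<Integer, List<Integer>> termPositions(String term) {
        return postings.getOrDefault(term, new TreeMap<>());
    }

    public static void main(String[] args) {
        ToyPositionalIndex index = new ToyPositionalIndex();
        index.addDocument("to be or not to be");
        index.addDocument("be quick");
        // Entries come back in increasing document-number order.
        index.termPositions("be").forEach((docNum, positions) ->
                System.out.println(docNum + " freq=" + positions.size() + " positions=" + positions));
    }
}
```

Running `main` prints `0 freq=2 positions=[1, 5]` followed by `1 freq=1 positions=[0]`. A phrase query such as "to be" can then be answered by looking for documents where a position of "be" is one greater than a position of "to".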
IOException
- if there is a low-level IO error
public abstract TermPositions termPositions() throws IOException
TermPositions
enumerator.
IOException
- if there is a low-level IO error
public void deleteDocument(int docNum) throws StaleReaderException, CorruptIndexException, LockObtainFailedException, IOException
docNum
. Once a document is
deleted it will not appear in TermDocs or TermPositions enumerations.
Attempts to read its field with the document(int)
method will result in an error. The presence of this document may still be
reflected in the docFreq(org.apache.lucene.index.Term)
statistic, though
this will be corrected eventually as the index is further modified.
StaleReaderException
- if the index has changed
since this reader was opened
CorruptIndexException
- if the index is corrupt
LockObtainFailedException
- if another writer
has this index open (write.lock
could not
be obtained)
IOException
- if there is a low-level IO error
protected abstract void doDelete(int docNum) throws CorruptIndexException, IOException
docNum
.
Applications should call deleteDocument(int)
or deleteDocuments(Term)
.
CorruptIndexException
IOException
public int deleteDocuments(Term term) throws StaleReaderException, CorruptIndexException, LockObtainFailedException, IOException
term
indexed.
This is useful if one uses a document field to hold a unique ID string for
the document. Then to delete such a document, one merely constructs a
term with the appropriate field and the unique ID string as its text and
passes it to this method.
See deleteDocument(int)
for information about when this deletion will
become effective.
StaleReaderException
- if the index has changed
since this reader was opened
CorruptIndexException
- if the index is corrupt
LockObtainFailedException
- if another writer
has this index open (write.lock
could not
be obtained)
IOException
- if there is a low-level IO error
public void undeleteAll() throws StaleReaderException, CorruptIndexException, LockObtainFailedException, IOException
NOTE: this method can only recover documents marked
for deletion but not yet removed from the index; when
and how Lucene removes deleted documents is an
implementation detail, subject to change from release
to release. However, you can use numDeletedDocs()
on the current IndexReader instance to
see how many documents will be un-deleted.
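The relationship between deletion marks and the document counters can be sketched with a small self-contained model. This is not Lucene code: `ToyDeletions` is a hypothetical class, and it assumes that deletions are tracked as marks over existing document numbers, so that the number of deleted documents equals `maxDoc() - numDocs()` until the marked documents are physically removed.

```java
import java.util.BitSet;

/** Toy model of deletion bookkeeping over document numbers; NOT Lucene code. */
public class ToyDeletions {
    private final int maxDoc;                    // one greater than the largest document number
    private final BitSet deleted = new BitSet(); // bit n set means document n is marked deleted

    public ToyDeletions(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    public int maxDoc() { return maxDoc; }

    public int numDeletedDocs() { return deleted.cardinality(); }

    /** Live documents: every document number that is not marked deleted. */
    public int numDocs() { return maxDoc - numDeletedDocs(); }

    public boolean isDeleted(int n) { return deleted.get(n); }

    public boolean hasDeletions() { return !deleted.isEmpty(); }

    /** Marks a document as deleted; document numbers of other documents are unchanged. */
    public void deleteDocument(int docNum) {
        if (docNum < 0 || docNum >= maxDoc) {
            throw new IllegalArgumentException("docNum out of range: " + docNum);
        }
        deleted.set(docNum);
    }

    /** Clears every deletion mark, restoring all marked documents. */
    public void undeleteAll() { deleted.clear(); }

    public static void main(String[] args) {
        ToyDeletions docs = new ToyDeletions(5);
        docs.deleteDocument(1);
        docs.deleteDocument(3);
        System.out.println("before undelete: numDocs=" + docs.numDocs()
                + " numDeletedDocs=" + docs.numDeletedDocs());
        docs.undeleteAll();
        System.out.println("after undelete: numDocs=" + docs.numDocs());
    }
}
```

In this model, reading `numDeletedDocs()` just before `undeleteAll()` gives the number of documents that will be restored, which matches the advice in the note above.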
StaleReaderException
- if the index has changed
since this reader was opened
LockObtainFailedException
- if another writer
has this index open (write.lock
could not
be obtained)
CorruptIndexException
- if the index is corrupt
IOException
- if there is a low-level IO error
protected abstract void doUndeleteAll() throws CorruptIndexException, IOException
CorruptIndexException
IOException
protected void acquireWriteLock() throws IOException
IOException
public final void flush() throws IOException
IOException
public final void flush(Map<String,String> commitUserData) throws IOException
commitUserData
- Opaque Map (String -> String)
that's recorded into the segments file in the index,
and retrievable by getCommitUserData(org.apache.lucene.store.Directory)
.
IOException
protected final void commit() throws IOException
IOException
- if there is a low-level IO error
public final void commit(Map<String,String> commitUserData) throws IOException
IOException
- if there is a low-level IO error
protected abstract void doCommit(Map<String,String> commitUserData) throws IOException
IOException
public final void close() throws IOException
close
in interface Closeable
IOException
- if there is a low-level IO error
protected abstract void doClose() throws IOException
IOException
public abstract Collection<String> getFieldNames(IndexReader.FieldOption fldOption)
fldOption
- specifies which field option should be available for the returned fields
IndexReader.FieldOption
public IndexCommit getIndexCommit() throws IOException
IOException
public static void main(String[] args)
args
- Usage: org.apache.lucene.index.IndexReader [-extract] <cfsfile>
public static Collection<IndexCommit> listCommits(Directory dir) throws IOException
KeepOnlyLastCommitDeletionPolicy
, there would be only
one commit point. But if you're using a custom IndexDeletionPolicy
then there could be many commits.
Once you have a given commit, you can open a reader on
it by calling open(IndexCommit,boolean).
There must be at least one commit in
the Directory, else this method throws IndexNotFoundException
. Note that if a commit is in
progress while this method is running, that commit
may or may not be returned.
IndexCommit
s, from oldest
to latest.
IOException
public IndexReader[] getSequentialSubReaders()
NOTE: You should not try using sub-readers returned by
this method to make any changes (setNorm, deleteDocument,
etc.). While this might succeed for one composite reader
(like MultiReader), it will most likely lead to index
corruption for other readers (like DirectoryReader obtained
through open(org.apache.lucene.store.Directory)
). Use the parent reader directly.
public Object getCoreCacheKey()
public Object getDeletesCacheKey()
public long getUniqueTermCount() throws IOException
UnsupportedOperationException
- if this count
cannot be easily determined (eg Multi*Readers).
Instead, you should call getSequentialSubReaders()
and ask each sub reader for
its unique term count.
IOException
public int getTermInfosIndexDivisor()