SegmentReader (Lucene 3.4.0 API)

org.apache.lucene.index
Class SegmentReader

java.lang.Object
  org.apache.lucene.index.IndexReader
      org.apache.lucene.index.SegmentReader

All Implemented Interfaces:: Closeable, Cloneable

public class SegmentReader
extends IndexReader
implements Cloneable

WARNING: This API is experimental and might change in incompatible ways in the next release.

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.lucene.index.IndexReader
`IndexReader.FieldOption, IndexReader.ReaderFinishedListener`

Field Summary
`protected boolean`	`readOnly`

Fields inherited from class org.apache.lucene.index.IndexReader
`hasChanges, readerFinishedListeners`

Constructor Summary
`SegmentReader()`

Method Summary
`Object`	`clone()` Efficiently clones the IndexReader (sharing most internal state).
`IndexReader`	`clone(boolean openReadOnly)` Clones the IndexReader and optionally changes readOnly.
`protected BitVector`	`cloneDeletedDocs(BitVector bv)` Clones the deleteDocs BitVector.
`protected byte[]`	`cloneNormBytes(byte[] bytes)` Clones the norm bytes.
`Directory`	`directory()` Returns the directory this index resides in.
`int`	`docFreq(Term t)` Returns the number of documents containing the term `t`.
`protected void`	`doClose()` Implements close.
`protected void`	`doCommit(Map<String,String> commitUserData)` Implements commit.
`Document`	`document(int n, FieldSelector fieldSelector)` Get the `Document` at the `n`th position.
`protected void`	`doDelete(int docNum)` Implements deletion of the document numbered `docNum`.
`protected void`	`doSetNorm(int doc, String field, byte value)` Implements setNorm in subclass.
`protected void`	`doUndeleteAll()` Implements actual undeleteAll() in subclass.
`static SegmentReader`	`get(boolean readOnly, Directory dir, SegmentInfo si, int readBufferSize, boolean doOpenStores, int termInfosIndexDivisor)`
`static SegmentReader`	`get(boolean readOnly, SegmentInfo si, int termInfosIndexDivisor)`
`Object`	`getCoreCacheKey()` Expert.
`Object`	`getDeletesCacheKey()` Expert.
`Collection<String>`	`getFieldNames(IndexReader.FieldOption fieldOption)` Get a list of unique field names that exist in this index and have the specified field option information.
`String`	`getSegmentName()` Return the name of the segment this reader is reading.
`TermFreqVector`	`getTermFreqVector(int docNumber, String field)` Return a term frequency vector for the specified document and field.
`void`	`getTermFreqVector(int docNumber, String field, TermVectorMapper mapper)` Load the Term Vector into a user-defined data structure instead of relying on the parallel arrays of the `TermFreqVector`.
`void`	`getTermFreqVector(int docNumber, TermVectorMapper mapper)` Map all the term vectors for all fields in a Document.
`TermFreqVector[]`	`getTermFreqVectors(int docNumber)` Return an array of term frequency vectors for the specified document.
`int`	`getTermInfosIndexDivisor()` For IndexReader implementations that use TermInfosReader to read terms, this returns the current indexDivisor as specified when the reader was opened.
`long`	`getUniqueTermCount()` Returns the number of unique terms (across all fields) in this reader.
`boolean`	`hasDeletions()` Returns true if any documents have been deleted.
`boolean`	`hasNorms(String field)` Returns true if there are norms stored for this field.
`boolean`	`isDeleted(int n)` Returns true if document n has been deleted.
`int`	`maxDoc()` Returns one greater than the largest possible document number.
`byte[]`	`norms(String field)` Returns the byte-encoded normalization factor for the named field of every document.
`void`	`norms(String field, byte[] bytes, int offset)` Read norms into a pre-allocated array.
`int`	`numDocs()` Returns the number of documents in this index.
`protected void`	`readerFinished()`
`IndexReader`	`reopen()` Refreshes an IndexReader if the index has changed since this instance was (re)opened.
`IndexReader`	`reopen(boolean openReadOnly)` Just like `IndexReader.reopen()`, except you can change the readOnly of the original reader.
`TermDocs`	`termDocs()` Returns an unpositioned `TermDocs` enumerator.
`TermDocs`	`termDocs(Term term)` Returns an enumeration of all the documents which contain `term`.
`TermPositions`	`termPositions()` Returns an unpositioned `TermPositions` enumerator.
`TermEnum`	`terms()` Returns an enumeration of all the terms in the index.
`TermEnum`	`terms(Term t)` Returns an enumeration of all terms starting at a given term.
`String`	`toString()`

Methods inherited from class org.apache.lucene.index.IndexReader
acquireWriteLock, addReaderFinishedListener, close, commit, commit, decRef, deleteDocument, deleteDocuments, document, ensureOpen, flush, flush, getCommitUserData, getCommitUserData, getCurrentVersion, getIndexCommit, getRefCount, getSequentialSubReaders, getVersion, incRef, indexExists, isCurrent, isOptimized, lastModified, listCommits, main, notifyReaderFinishedListeners, numDeletedDocs, open, open, open, open, open, open, open, open, removeReaderFinishedListener, reopen, reopen, setNorm, setNorm, termPositions, undeleteAll

Methods inherited from class java.lang.Object
`equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait`

Field Detail

readOnly

protected boolean readOnly

Constructor Detail

SegmentReader

public SegmentReader()

Method Detail

get

public static SegmentReader get(boolean readOnly,
                                SegmentInfo si,
                                int termInfosIndexDivisor)
                         throws CorruptIndexException,
                                IOException

Throws:: CorruptIndexException - if the index is corrupt; IOException - if there is a low-level IO error

get

public static SegmentReader get(boolean readOnly,
                                Directory dir,
                                SegmentInfo si,
                                int readBufferSize,
                                boolean doOpenStores,
                                int termInfosIndexDivisor)
                         throws CorruptIndexException,
                                IOException

Throws:: CorruptIndexException - if the index is corrupt; IOException - if there is a low-level IO error

cloneNormBytes

protected byte[] cloneNormBytes(byte[] bytes)

Clones the norm bytes. May be overridden by subclasses. New and experimental.

Parameters:: bytes - Byte array to clone
Returns:: New byte array

cloneDeletedDocs

protected BitVector cloneDeletedDocs(BitVector bv)

Clones the deleteDocs BitVector. May be overridden by subclasses. New and experimental.

Parameters:: bv - BitVector to clone
Returns:: New BitVector

clone

public final Object clone()

Description copied from class: IndexReader

Efficiently clones the IndexReader (sharing most internal state).

On cloning a reader with pending changes (deletions, norms), the original reader transfers its write lock to the cloned reader. This means only the cloned reader may make further changes to the index, and commit the changes to the index on close, but the old reader still reflects all changes made up until it was cloned.

Like IndexReader.reopen(), it's safe to make changes to either the original or the cloned reader: all shared mutable state obeys "copy on write" semantics to ensure the changes are not seen by other readers.
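
The "copy on write" behavior described above can be illustrated with a minimal, self-contained sketch. It does not use Lucene classes: `CowDeletes` is a hypothetical stand-in that uses `java.util.BitSet` in place of the deleted-docs BitVector and a shared counter in place of the reader's internal bookkeeping, so it shows the idea rather than the actual implementation.

```java
import java.util.BitSet;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a reader's deleted-docs state; not a Lucene class.
// Not thread-safe: a real reader would need to guard these steps.
final class CowDeletes {
    private BitSet bits;             // shared with clones until the first write
    private AtomicInteger sharers;   // how many instances reference 'bits'

    CowDeletes(int maxDoc) {
        this.bits = new BitSet(maxDoc);
        this.sharers = new AtomicInteger(1);
    }

    private CowDeletes(BitSet bits, AtomicInteger sharers) {
        this.bits = bits;
        this.sharers = sharers;
    }

    // Cheap clone: both instances point at the same BitSet.
    CowDeletes cloneShared() {
        sharers.incrementAndGet();
        return new CowDeletes(bits, sharers);
    }

    // Copy the shared bits before writing so other sharers never see the change.
    void delete(int docNum) {
        if (sharers.get() > 1) {
            sharers.decrementAndGet();
            bits = (BitSet) bits.clone();
            sharers = new AtomicInteger(1);
        }
        bits.set(docNum);
    }

    boolean isDeleted(int docNum) {
        return bits.get(docNum);
    }
}
```

In this sketch a deletion made through either instance after `cloneShared()` stays private to that instance, while deletions made before the clone remain visible to both.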

Overrides:: clone in class IndexReader

clone

public final IndexReader clone(boolean openReadOnly)
                        throws CorruptIndexException,
                               IOException

Description copied from class: IndexReader

Clones the IndexReader and optionally changes readOnly. A readOnly reader cannot open a writeable reader.

Overrides:: clone in class IndexReader

Throws:: CorruptIndexException - if the index is corrupt; IOException - if there is a low-level IO error

reopen

public IndexReader reopen()
                   throws CorruptIndexException,
                          IOException

Description copied from class: IndexReader

Refreshes an IndexReader if the index has changed since this instance was (re)opened.

Opening an IndexReader is an expensive operation. This method can be used to refresh an existing IndexReader to reduce these costs. This method tries to only load segments that have changed or were created after the IndexReader was (re)opened.

If the index has not changed since this instance was (re)opened, then this call is a NOOP and returns this instance. Otherwise, a new instance is returned. The old instance is not closed and remains usable.

If the reader is reopened, even though they share resources internally, it's safe to make changes (deletions, norms) with the new reader. All shared mutable state obeys "copy on write" semantics to ensure the changes are not seen by other readers.

You can determine whether a reader was actually reopened by comparing the old instance with the instance returned by this method:

 IndexReader reader = ... 
 ...
 IndexReader newReader = reader.reopen();
 if (newReader != reader) {
   ...     // reader was reopened
   reader.close(); 
 }
 reader = newReader;
 ...

Be sure to synchronize that code so that other threads, if present, can never use reader after it has been closed and before it's switched to newReader.
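
One way to synchronize the swap is sketched below. `ReaderHolder` and its `Reopener` interface are hypothetical names, not part of Lucene; the reopen step is passed in as a callback so the block compiles without Lucene on the classpath. With a real reader, the callback would presumably just call `reopen()` on the current instance.

```java
import java.io.Closeable;
import java.io.IOException;

// Hypothetical helper, not part of Lucene: guards the "reopen, close old, swap" sequence.
final class ReaderHolder<R extends Closeable> {

    // Stand-in for the reopen() call; returns the same instance when nothing changed.
    interface Reopener<R> {
        R reopen(R current) throws IOException;
    }

    private final Reopener<R> reopener;
    private R reader;

    ReaderHolder(R initial, Reopener<R> reopener) {
        this.reader = initial;
        this.reopener = reopener;
    }

    // Returns the reader that is current at the time of the call.
    synchronized R current() {
        return reader;
    }

    // Returns true if a new reader was swapped in.
    synchronized boolean refresh() throws IOException {
        R newReader = reopener.reopen(reader);
        if (newReader == reader) {
            return false;      // index unchanged: reopen was a no-op
        }
        reader.close();        // the old reader is closed before the swap is published
        reader = newReader;
        return true;
    }
}
```

The lock only protects the swap itself. A thread that keeps using a reader after `current()` returns could still see it closed by a later `refresh()`; the inherited `incRef` and `decRef` methods are the likely tool for that case, which this sketch leaves out.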

NOTE: If this reader is a near real-time reader (obtained from IndexWriter.getReader()), reopen() will simply call writer.getReader() again for you, though this may change in the future.

Overrides:: reopen in class IndexReader

Throws:: CorruptIndexException - if the index is corrupt; IOException - if there is a low-level IO error

reopen

public IndexReader reopen(boolean openReadOnly)
                   throws CorruptIndexException,
                          IOException

Description copied from class: IndexReader

Just like IndexReader.reopen(), except you can change the readOnly of the original reader. If the index is unchanged but readOnly is different then a new reader will be returned.

Overrides:: reopen in class IndexReader

Throws:: CorruptIndexException; IOException

doCommit

protected void doCommit(Map<String,String> commitUserData)
                 throws IOException

Description copied from class: IndexReader

Implements commit.

Specified by:: doCommit in class IndexReader

Throws:: IOException

doClose

protected void doClose()
                throws IOException

Description copied from class: IndexReader

Implements close.

Specified by:: doClose in class IndexReader

Throws:: IOException

hasDeletions

public boolean hasDeletions()

Description copied from class: IndexReader

Returns true if any documents have been deleted.

Specified by:: hasDeletions in class IndexReader

doDelete

protected void doDelete(int docNum)

Description copied from class: IndexReader

Implements deletion of the document numbered docNum. Applications should call IndexReader.deleteDocument(int) or IndexReader.deleteDocuments(Term).

Specified by:: doDelete in class IndexReader

doUndeleteAll

protected void doUndeleteAll()

Description copied from class: IndexReader

Implements actual undeleteAll() in subclass.

Specified by:: doUndeleteAll in class IndexReader

terms

public TermEnum terms()

Description copied from class: IndexReader

Returns an enumeration of all the terms in the index. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration. Note that after calling terms(), TermEnum.next() must be called on the resulting enumeration before calling other methods such as TermEnum.term().

Specified by:: terms in class IndexReader

terms

public TermEnum terms(Term t)
               throws IOException

Description copied from class: IndexReader

Returns an enumeration of all terms starting at a given term. If the given term does not exist, the enumeration is positioned at the first term greater than the supplied term. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration.
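
The positioning rule above (start at the given term, or at the first greater term if it is absent) can be modeled with a sorted set. This is only a sketch: `TermSeekSketch` and its nested `T` class are hypothetical stand-ins for the term dictionary and for `Term`, ordered by field and then by text.

```java
import java.util.TreeSet;

// Hypothetical model of a sorted term dictionary; not a Lucene class.
final class TermSeekSketch {

    // Minimal stand-in for a term: ordered by field, then by text.
    static final class T implements Comparable<T> {
        final String field;
        final String text;

        T(String field, String text) {
            this.field = field;
            this.text = text;
        }

        @Override
        public int compareTo(T other) {
            int c = field.compareTo(other.field);
            return c != 0 ? c : text.compareTo(other.text);
        }

        @Override
        public String toString() {
            return field + ":" + text;
        }
    }

    // Returns the term the enumeration would start at, or null if no term is >= target.
    static T seek(TreeSet<T> dictionary, T target) {
        return dictionary.ceiling(target);   // least element >= target
    }
}
```

Because the ordering compares the field first, seeking past the last term of one field lands on the first term of the next field in this model.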

Specified by:: terms in class IndexReader

Throws:: IOException - if there is a low-level IO error

document

public Document document(int n,
                         FieldSelector fieldSelector)
                  throws CorruptIndexException,
                         IOException

Description copied from class: IndexReader

Get the Document at the nth position. The FieldSelector may be used to determine what Fields to load and how they should be loaded. NOTE: If this Reader (more specifically, the underlying FieldsReader) is closed before the lazy Field is loaded, an exception may be thrown. If you want the value of a lazy Field to be available after closing, you must explicitly load it or fetch the Document again with a new loader.

NOTE: for performance reasons, this method does not check whether the requested document is deleted, so asking for a deleted document may yield unspecified results. Usually this check is not required; however, you can call IndexReader.isDeleted(int) with the requested document ID to verify that the document is not deleted.

Specified by:: document in class IndexReader

Parameters:: n - Get the document at the nth position; fieldSelector - The FieldSelector to use to determine what Fields should be loaded on the Document. May be null, in which case all Fields will be loaded.
Returns:: The stored fields of the Document at the nth position
Throws:: CorruptIndexException - if the index is corrupt; IOException - if there is a low-level IO error
See Also:: Fieldable, FieldSelector, SetBasedFieldSelector, LoadFirstFieldSelector

isDeleted

public boolean isDeleted(int n)

Description copied from class: IndexReader

Returns true if document n has been deleted.

Specified by:: isDeleted in class IndexReader

termDocs

public TermDocs termDocs(Term term)
                  throws IOException

Description copied from class: IndexReader

Returns an enumeration of all the documents which contain term. For each document, the document number and the frequency of the term in that document are provided, for use in search scoring. If term is null, then all non-deleted docs are returned with freq=1. Thus, this method implements the mapping:

Term => <docNum, freq>*

The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.
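
The shape of this enumeration can be sketched with a small in-memory model. `PostingsSketch` is hypothetical and does not use Lucene classes: documents are plain token arrays, deletions are a `java.util.BitSet`, and each result entry is an `int[]` holding a document number and a frequency.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical in-memory model of the enumeration; not a Lucene class.
final class PostingsSketch {

    // For a given term: one {docNum, freq} pair per matching document, in docNum order.
    static List<int[]> postings(String[][] docs, String term) {
        List<int[]> out = new ArrayList<int[]>();
        for (int doc = 0; doc < docs.length; doc++) {
            int freq = 0;
            for (String token : docs[doc]) {
                if (token.equals(term)) {
                    freq++;
                }
            }
            if (freq > 0) {
                out.add(new int[] { doc, freq });
            }
        }
        return out;
    }

    // For a null term: every non-deleted document number in increasing order, freq = 1.
    static List<int[]> allLiveDocs(int maxDoc, BitSet deleted) {
        List<int[]> out = new ArrayList<int[]>();
        for (int doc = deleted.nextClearBit(0); doc < maxDoc; doc = deleted.nextClearBit(doc + 1)) {
            out.add(new int[] { doc, 1 });
        }
        return out;
    }
}
```

Both methods visit document numbers in ascending order, which matches the ordering guarantee stated above.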

Overrides:: termDocs in class IndexReader

Throws:: IOException - if there is a low-level IO error

termDocs

public TermDocs termDocs()
                  throws IOException

Description copied from class: IndexReader

Returns an unpositioned TermDocs enumerator.

Note: the TermDocs returned is unpositioned. Before using it, ensure that you first position it with TermDocs.seek(Term) or TermDocs.seek(TermEnum).

Specified by:: termDocs in class IndexReader

Throws:: IOException - if there is a low-level IO error

termPositions

public TermPositions termPositions()
                            throws IOException

Description copied from class: IndexReader

Returns an unpositioned TermPositions enumerator.

Specified by:: termPositions in class IndexReader

Throws:: IOException - if there is a low-level IO error

docFreq

public int docFreq(Term t)
            throws IOException

Description copied from class: IndexReader

Returns the number of documents containing the term t.

Specified by:: docFreq in class IndexReader

Throws:: IOException - if there is a low-level IO error

numDocs

public int numDocs()

Description copied from class: IndexReader

Returns the number of documents in this index.

Specified by:: numDocs in class IndexReader

maxDoc

public int maxDoc()

Description copied from class: IndexReader

Returns one greater than the largest possible document number. This may be used to, e.g., determine how big to allocate an array which will have an element for every document number in an index.
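
The array-sizing use mentioned above might look like the following sketch. `PerDocArraySketch` is a hypothetical helper that does not touch Lucene: the deletions are modeled with a `java.util.BitSet`, the per-document value comes from a caller-supplied function, and `-1` is an arbitrary sentinel for deleted slots.

```java
import java.util.BitSet;
import java.util.function.IntUnaryOperator;

// Hypothetical helper; a BitSet stands in for the reader's deletions.
final class PerDocArraySketch {

    // One slot per document number in [0, maxDoc); deleted documents get the sentinel -1.
    static int[] buildPerDocValues(int maxDoc, BitSet deleted, IntUnaryOperator valueOf) {
        int[] values = new int[maxDoc];
        for (int doc = 0; doc < maxDoc; doc++) {
            values[doc] = deleted.get(doc) ? -1 : valueOf.applyAsInt(doc);
        }
        return values;
    }

    // Live documents = maxDoc minus the deletions that fall inside [0, maxDoc).
    static int liveCount(int maxDoc, BitSet deleted) {
        return maxDoc - deleted.get(0, maxDoc).cardinality();
    }
}
```

Sizing by the maximum document number rather than by the live count keeps every document number a valid index, even when some documents have been deleted.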

Specified by:: maxDoc in class IndexReader

getFieldNames

public Collection<String> getFieldNames(IndexReader.FieldOption fieldOption)

Description copied from class: IndexReader

Get a list of unique field names that exist in this index and have the specified field option information.

Specified by:: getFieldNames in class IndexReader

Parameters:: fieldOption - specifies which field option should be available for the returned fields
Returns:: Collection of Strings indicating the names of the fields.
See Also:: IndexReader.getFieldNames(org.apache.lucene.index.IndexReader.FieldOption)

hasNorms

public boolean hasNorms(String field)

Description copied from class: IndexReader

Returns true if there are norms stored for this field.

Overrides:: hasNorms in class IndexReader

norms

public byte[] norms(String field)
             throws IOException

Description copied from class: IndexReader

Returns the byte-encoded normalization factor for the named field of every document. This is used by the search code to score documents. Returns null if norms were not indexed for this field.

Specified by:: norms in class IndexReader

Throws:: IOException
See Also:: AbstractField.setBoost(float)

doSetNorm

protected void doSetNorm(int doc,
                         String field,
                         byte value)
                  throws IOException

Description copied from class: IndexReader

Implements setNorm in subclass.

Specified by:: doSetNorm in class IndexReader

Throws:: IOException

norms

public void norms(String field,
                  byte[] bytes,
                  int offset)
           throws IOException

Read norms into a pre-allocated array.

Specified by:: norms in class IndexReader

Throws:: IOException
See Also:: AbstractField.setBoost(float)

getTermFreqVector

public TermFreqVector getTermFreqVector(int docNumber,
                                        String field)
                                 throws IOException

Return a term frequency vector for the specified document and field. The vector returned contains term numbers and frequencies for all terms in the specified field of this document, if the field had storeTermVector flag set. If the flag was not set, the method returns null.

Specified by:: getTermFreqVector in class IndexReader

Parameters:: docNumber - document for which the term frequency vector is returned; field - field for which the term frequency vector is returned.
Returns:: term frequency vector. May be null if field does not exist in the specified document or term vector was not stored.
Throws:: IOException
See Also:: Field.TermVector

getTermFreqVector

public void getTermFreqVector(int docNumber,
                              String field,
                              TermVectorMapper mapper)
                       throws IOException

Description copied from class: IndexReader

Load the Term Vector into a user-defined data structure instead of relying on the parallel arrays of the TermFreqVector.

Specified by:: getTermFreqVector in class IndexReader

Parameters:: docNumber - The number of the document to load the vector for; field - The name of the field to load; mapper - The TermVectorMapper to process the vector. Must not be null
Throws:: IOException - if term vectors cannot be accessed or if they do not exist on the field and doc. specified.

getTermFreqVector

public void getTermFreqVector(int docNumber,
                              TermVectorMapper mapper)
                       throws IOException

Description copied from class: IndexReader

Map all the term vectors for all fields in a Document.

Specified by:: getTermFreqVector in class IndexReader

Parameters:: docNumber - The number of the document to load the vector for; mapper - The TermVectorMapper to process the vector. Must not be null
Throws:: IOException - if term vectors cannot be accessed or if they do not exist on the field and doc. specified.

getTermFreqVectors

public TermFreqVector[] getTermFreqVectors(int docNumber)
                                    throws IOException

Return an array of term frequency vectors for the specified document. The array contains a vector for each vectorized field in the document. Each vector contains term numbers and frequencies for all terms in a given vectorized field. If no such fields existed, the method returns null.

Specified by:: getTermFreqVectors in class IndexReader

Parameters:: docNumber - document for which term frequency vectors are returned
Returns:: array of term frequency vectors. May be null if no term vectors have been stored for the specified document.
Throws:: IOException
See Also:: Field.TermVector

toString

public String toString()

Overrides:: toString in class IndexReader

getSegmentName

public String getSegmentName()

Return the name of the segment this reader is reading.

getCoreCacheKey

public final Object getCoreCacheKey()

Description copied from class: IndexReader

Expert.

Overrides:: getCoreCacheKey in class IndexReader

getDeletesCacheKey

public Object getDeletesCacheKey()

Description copied from class: IndexReader

Expert. Warning: this returns null if the reader has no deletions.

Overrides:: getDeletesCacheKey in class IndexReader

getUniqueTermCount

public long getUniqueTermCount()

Description copied from class: IndexReader

Returns the number of unique terms (across all fields) in this reader. This method returns long, even though internally Lucene cannot handle more than 2^31 unique terms, for a possible future when this limitation is removed.

Overrides:: getUniqueTermCount in class IndexReader

getTermInfosIndexDivisor

public int getTermInfosIndexDivisor()

Description copied from class: IndexReader

For IndexReader implementations that use TermInfosReader to read terms, this returns the current indexDivisor as specified when the reader was opened.

Overrides:: getTermInfosIndexDivisor in class IndexReader

readerFinished

protected void readerFinished()

Overrides:: readerFinished in class IndexReader
