ParallelReader (Lucene 3.0.3 API)

org.apache.lucene.index
Class ParallelReader

java.lang.Object
  org.apache.lucene.index.IndexReader
      org.apache.lucene.index.ParallelReader

All Implemented Interfaces:: Closeable, Cloneable

public class ParallelReader
extends IndexReader

An IndexReader which reads multiple, parallel indexes. Each index added must have the same number of documents, but typically each contains different fields. Each document contains the union of the fields of all documents with the same document number. When searching, matches for a query term are from the first index added that has the field.

This is useful, e.g., with collections that have large fields which change rarely and small fields that change more frequently. The smaller fields may be re-indexed in a new index and both indexes may be searched together.

Warning: It is up to you to make sure all indexes are created and modified the same way. For example, if you add documents to one index, you need to add the same documents in the same order to the other indexes. Failure to do so will result in undefined behavior.
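The field-union and first-index-wins rules described above can be illustrated with a small model in plain Java. This is a sketch, not Lucene code: `ParallelFieldsSketch`, its methods, and the sample data are hypothetical names invented for illustration, and keeping only the first value of a repeated field name is a simplifying assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: an "index" is a list of documents and a document is a map
// from field name to value. None of these types are Lucene classes.
final class ParallelFieldsSketch {
    private final List<List<Map<String, String>>> indexes = new ArrayList<>();

    // Every index added must hold the same number of documents.
    void add(List<Map<String, String>> index) {
        if (!indexes.isEmpty() && indexes.get(0).size() != index.size()) {
            throw new IllegalArgumentException("indexes must contain the same number of documents");
        }
        indexes.add(index);
    }

    // Document n is the union of the fields stored under number n in each index.
    // Assumption of this sketch: if a field name repeats, the first index added wins.
    Map<String, String> document(int n) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (List<Map<String, String>> index : indexes) {
            for (Map.Entry<String, String> field : index.get(n).entrySet()) {
                merged.putIfAbsent(field.getKey(), field.getValue());
            }
        }
        return merged;
    }

    // Position, in add order, of the first index that has the field, or -1 if none does.
    int indexForField(String field) {
        for (int i = 0; i < indexes.size(); i++) {
            for (Map<String, String> doc : indexes.get(i)) {
                if (doc.containsKey(field)) {
                    return i;
                }
            }
        }
        return -1;
    }

    // Two parallel indexes: large, rarely changing "body" fields and small "price" fields.
    static ParallelFieldsSketch sample() {
        ParallelFieldsSketch reader = new ParallelFieldsSketch();
        reader.add(Arrays.asList(
                Collections.singletonMap("body", "long text A"),
                Collections.singletonMap("body", "long text B")));
        reader.add(Arrays.asList(
                Collections.singletonMap("price", "10"),
                Collections.singletonMap("price", "12")));
        return reader;
    }

    public static void main(String[] args) {
        ParallelFieldsSketch reader = sample();
        System.out.println(reader.document(1));            // fields from both indexes
        System.out.println(reader.indexForField("price")); // index that would serve "price"
    }
}
```

In this model, re-indexing only the small `price` fields means rebuilding the second list while the first stays untouched, which is the use case the class description has in mind.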

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.lucene.index.IndexReader
`IndexReader.FieldOption`

Field Summary

Fields inherited from class org.apache.lucene.index.IndexReader
`hasChanges`

Constructor Summary
`ParallelReader()` Construct a ParallelReader.
`ParallelReader(boolean closeSubReaders)` Construct a ParallelReader.

Method Summary
`void`	`add(IndexReader reader)` Add an IndexReader.
`void`	`add(IndexReader reader, boolean ignoreStoredFields)` Add an IndexReader whose stored fields will not be returned.
`Object`	`clone()` Efficiently clones the IndexReader (sharing most internal state).
`int`	`docFreq(Term term)` Returns the number of documents containing `term`.
`protected void`	`doClose()` Implements close.
`protected void`	`doCommit(Map<String,String> commitUserData)` Implements commit.
`Document`	`document(int n, FieldSelector fieldSelector)` Get the `Document` at the `n`th position.
`protected void`	`doDelete(int n)` Implements deletion of the document numbered `n`.
`protected IndexReader`	`doReopen(boolean doClone)`
`protected void`	`doSetNorm(int n, String field, byte value)` Implements setNorm in subclass.
`protected void`	`doUndeleteAll()` Implements actual undeleteAll() in subclass.
`Collection<String>`	`getFieldNames(IndexReader.FieldOption fieldNames)` Get a list of unique field names that exist in this index and have the specified field option information.
`TermFreqVector`	`getTermFreqVector(int n, String field)` Return a term frequency vector for the specified document and field.
`void`	`getTermFreqVector(int docNumber, String field, TermVectorMapper mapper)` Load the Term Vector into a user-defined data structure instead of relying on the parallel arrays of the `TermFreqVector`.
`void`	`getTermFreqVector(int docNumber, TermVectorMapper mapper)` Map all the term vectors for all fields in a Document.
`TermFreqVector[]`	`getTermFreqVectors(int n)` Return an array of term frequency vectors for the specified document.
`long`	`getVersion()` Not implemented.
`boolean`	`hasDeletions()` Returns true if any documents have been deleted.
`boolean`	`hasNorms(String field)` Returns true if there are norms stored for this field.
`boolean`	`isCurrent()` Checks recursively if all subreaders are up to date.
`boolean`	`isDeleted(int n)` Returns true if document n has been deleted.
`boolean`	`isOptimized()` Checks recursively if all subindexes are optimized.
`int`	`maxDoc()` Returns one greater than the largest possible document number.
`byte[]`	`norms(String field)` Returns the byte-encoded normalization factor for the named field of every document.
`void`	`norms(String field, byte[] result, int offset)` Reads the byte-encoded normalization factor for the named field of every document.
`int`	`numDocs()` Returns the number of documents in this index.
`IndexReader`	`reopen()` Tries to reopen the subreaders.
`TermDocs`	`termDocs()` Returns an unpositioned `TermDocs` enumerator.
`TermDocs`	`termDocs(Term term)` Returns an enumeration of all the documents which contain `term`.
`TermPositions`	`termPositions()` Returns an unpositioned `TermPositions` enumerator.
`TermPositions`	`termPositions(Term term)` Returns an enumeration of all the documents which contain `term`.
`TermEnum`	`terms()` Returns an enumeration of all the terms in the index.
`TermEnum`	`terms(Term term)` Returns an enumeration of all terms starting at a given term.

Methods inherited from class org.apache.lucene.index.IndexReader
`acquireWriteLock, clone, close, commit, commit, decRef, deleteDocument, deleteDocuments, directory, document, ensureOpen, flush, flush, getCommitUserData, getCommitUserData, getCurrentVersion, getDeletesCacheKey, getFieldCacheKey, getIndexCommit, getRefCount, getSequentialSubReaders, getTermInfosIndexDivisor, getUniqueTermCount, incRef, indexExists, lastModified, listCommits, main, numDeletedDocs, open, open, open, open, open, open, open, reopen, reopen, setNorm, setNorm, undeleteAll`

Methods inherited from class java.lang.Object
`equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

ParallelReader

public ParallelReader()
               throws IOException

Construct a ParallelReader.

Note that all subreaders are closed if this ParallelReader is closed.

Throws:: IOException

ParallelReader

public ParallelReader(boolean closeSubReaders)
               throws IOException

Construct a ParallelReader.

Parameters:: closeSubReaders - indicates whether the subreaders should be closed when this ParallelReader is closed
Throws:: IOException

Method Detail

add

public void add(IndexReader reader)
         throws IOException

Add an IndexReader.

Throws:: IOException - if there is a low-level IO error

add

public void add(IndexReader reader,
                boolean ignoreStoredFields)
         throws IOException

Add an IndexReader whose stored fields will not be returned. This can accelerate search when stored fields are only needed from a subset of the IndexReaders.

Throws:: IllegalArgumentException - if not all indexes contain the same number of documents; IllegalArgumentException - if not all indexes have the same value of IndexReader.maxDoc(); IOException - if there is a low-level IO error
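The checks listed under Throws, together with the `ignoreStoredFields` flag, might be pictured as follows. This is a plain-Java sketch under stated assumptions: `AddChecksSketch` and `SubIndex` are hypothetical stand-ins rather than Lucene classes, and comparing each new index against the first one added is an assumption about how such a check could be organized.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; SubIndex is a hypothetical stand-in, not a Lucene class.
final class AddChecksSketch {
    static final class SubIndex {
        final int maxDoc;
        final int numDocs;

        SubIndex(int maxDoc, int numDocs) {
            this.maxDoc = maxDoc;
            this.numDocs = numDocs;
        }
    }

    private final List<SubIndex> all = new ArrayList<>();
    private final List<SubIndex> storedFieldSources = new ArrayList<>();

    // Rejects an index whose counts differ from the first one added, then
    // records whether its stored fields should be consulted later.
    void add(SubIndex index, boolean ignoreStoredFields) {
        if (!all.isEmpty()) {
            SubIndex first = all.get(0);
            if (index.maxDoc != first.maxDoc) {
                throw new IllegalArgumentException(
                        "maxDoc differs: " + first.maxDoc + " != " + index.maxDoc);
            }
            if (index.numDocs != first.numDocs) {
                throw new IllegalArgumentException(
                        "numDocs differs: " + first.numDocs + " != " + index.numDocs);
            }
        }
        all.add(index);
        if (!ignoreStoredFields) {
            storedFieldSources.add(index);
        }
    }

    int size() {
        return all.size();
    }

    int storedFieldSourceCount() {
        return storedFieldSources.size();
    }

    public static void main(String[] args) {
        AddChecksSketch reader = new AddChecksSketch();
        reader.add(new SubIndex(10, 10), false); // stored fields are returned
        reader.add(new SubIndex(10, 10), true);  // searched, but stored fields are skipped
        System.out.println(reader.size() + " indexes, "
                + reader.storedFieldSourceCount() + " stored-field source(s)");
    }
}
```

Skipping an index when loading stored fields is what allows document retrieval to touch fewer indexes, which is the speed-up the description refers to.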

clone

public Object clone()

Description copied from class: IndexReader

Efficiently clones the IndexReader (sharing most internal state).

On cloning a reader with pending changes (deletions, norms), the original reader transfers its write lock to the cloned reader. This means only the cloned reader may make further changes to the index, and commit the changes to the index on close, but the old reader still reflects all changes made up until it was cloned.

Like IndexReader.reopen(), it's safe to make changes to either the original or the cloned reader: all shared mutable state obeys "copy on write" semantics to ensure the changes are not seen by other readers.

Overrides:: clone in class IndexReader

reopen

public IndexReader reopen()
                   throws CorruptIndexException,
                          IOException

Tries to reopen the subreaders.
If one or more subreaders could be re-opened (i.e., subReader.reopen() returned a new instance != subReader), then a new ParallelReader instance is returned; otherwise this instance is returned.

A re-opened instance might share one or more subreaders with the old instance. Index modification operations result in undefined behavior when performed before the old instance is closed (see IndexReader.reopen()).

If subreaders are shared, then the reference count of those readers is increased to ensure that the subreaders remain open until the last referring reader is closed.
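The reference-counting rule for shared subreaders can be sketched in plain Java. This is a toy model, not Lucene's implementation: `SharedSub` is a hypothetical stand-in whose `incRef`/`decRef` names merely echo the inherited `IndexReader` methods of the same name.

```java
// Toy model only; SharedSub is a hypothetical stand-in for a shared subreader.
final class RefCountSketch {
    static final class SharedSub {
        private int refCount = 1; // whoever opened it holds the first reference

        // Called by each additional reader that starts sharing this subreader.
        void incRef() {
            if (refCount <= 0) {
                throw new IllegalStateException("already closed");
            }
            refCount++;
        }

        // Called when a referring reader is closed; the last call releases the subreader.
        void decRef() {
            if (refCount <= 0) {
                throw new IllegalStateException("already closed");
            }
            refCount--;
        }

        boolean isOpen() {
            return refCount > 0;
        }

        int getRefCount() {
            return refCount;
        }
    }

    public static void main(String[] args) {
        SharedSub sub = new SharedSub(); // held by the old reader
        sub.incRef();                    // the re-opened reader shares it
        sub.decRef();                    // the old reader is closed
        System.out.println(sub.isOpen()); // true: the re-opened reader still refers to it
        sub.decRef();                    // the re-opened reader is closed
        System.out.println(sub.isOpen()); // false: no referring reader remains
    }
}
```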

Overrides:: reopen in class IndexReader

Throws:: CorruptIndexException - if the index is corrupt; IOException - if there is a low-level IO error

doReopen

protected IndexReader doReopen(boolean doClone)
                        throws CorruptIndexException,
                               IOException

Throws:: CorruptIndexException; IOException

numDocs

public int numDocs()

Description copied from class: IndexReader

Returns the number of documents in this index.

Specified by:: numDocs in class IndexReader

maxDoc

public int maxDoc()

Description copied from class: IndexReader

Returns one greater than the largest possible document number. This may be used to, e.g., determine how big to allocate an array which will have an element for every document number in an index.
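As a plain-Java illustration of why per-document arrays are sized by maxDoc() rather than numDocs(), the sketch below models deletions with a `BitSet`. The class and method names are hypothetical and nothing here calls Lucene; it assumes every deleted bit lies in the range [0, maxDoc).

```java
import java.util.BitSet;

// Toy model only: document numbers run from 0 to maxDoc - 1, and deleted
// documents keep their numbers, so arrays indexed by document number need
// maxDoc slots even when fewer documents are live.
final class MaxDocSketch {
    // Number of live documents, assuming all deleted bits are below maxDoc.
    static int numDocs(int maxDoc, BitSet deleted) {
        return maxDoc - deleted.cardinality();
    }

    // One flag per possible document number: true where the document is live.
    static boolean[] liveFlags(int maxDoc, BitSet deleted) {
        boolean[] live = new boolean[maxDoc];
        for (int doc = 0; doc < maxDoc; doc++) {
            live[doc] = !deleted.get(doc);
        }
        return live;
    }

    public static void main(String[] args) {
        BitSet deleted = new BitSet();
        deleted.set(1);
        deleted.set(3);
        int maxDoc = 5;
        System.out.println(numDocs(maxDoc, deleted));          // 3
        System.out.println(liveFlags(maxDoc, deleted).length); // 5
    }
}
```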

Specified by:: maxDoc in class IndexReader

hasDeletions

public boolean hasDeletions()

Description copied from class: IndexReader

Returns true if any documents have been deleted.

Specified by:: hasDeletions in class IndexReader

isDeleted

public boolean isDeleted(int n)

Description copied from class: IndexReader

Returns true if document n has been deleted.

Specified by:: isDeleted in class IndexReader

doDelete

protected void doDelete(int n)
                 throws CorruptIndexException,
                        IOException

Description copied from class: IndexReader

Implements deletion of the document numbered n. Applications should call IndexReader.deleteDocument(int) or IndexReader.deleteDocuments(Term).

Specified by:: doDelete in class IndexReader

Throws:: CorruptIndexException; IOException

doUndeleteAll

protected void doUndeleteAll()
                      throws CorruptIndexException,
                             IOException

Description copied from class: IndexReader

Implements actual undeleteAll() in subclass.

Specified by:: doUndeleteAll in class IndexReader

Throws:: CorruptIndexException; IOException

document

public Document document(int n,
                         FieldSelector fieldSelector)
                  throws CorruptIndexException,
                         IOException

Description copied from class: IndexReader

Get the Document at the nth position. The FieldSelector may be used to determine what Fields to load and how they should be loaded. NOTE: If this Reader (more specifically, the underlying FieldsReader) is closed before the lazy Field is loaded, an exception may be thrown. If you want the value of a lazy Field to be available after closing, you must explicitly load it or fetch the Document again with a new loader.

NOTE: for performance reasons, this method does not check whether the requested document is deleted, so asking for a deleted document may yield unspecified results. Usually this check is not required; however, you can call IndexReader.isDeleted(int) with the requested document ID to verify the document is not deleted.

Specified by:: document in class IndexReader

Parameters:: n - Get the document at the nth position; fieldSelector - The FieldSelector to use to determine what Fields should be loaded on the Document. May be null, in which case all Fields will be loaded.
Returns:: The stored fields of the Document at the nth position
Throws:: CorruptIndexException - if the index is corrupt; IOException - if there is a low-level IO error
See Also:: Fieldable, FieldSelector, SetBasedFieldSelector, LoadFirstFieldSelector

getTermFreqVectors

public TermFreqVector[] getTermFreqVectors(int n)
                                    throws IOException

Description copied from class: IndexReader

Return an array of term frequency vectors for the specified document. The array contains a vector for each vectorized field in the document. Each vector contains terms and frequencies for all terms in a given vectorized field. If no such fields existed, the method returns null. The term vectors that are returned may either be of type TermFreqVector or of type TermPositionVector if positions or offsets have been stored.

Specified by:: getTermFreqVectors in class IndexReader

Parameters:: n - document for which term frequency vectors are returned
Returns:: array of term frequency vectors. May be null if no term vectors have been stored for the specified document.
Throws:: IOException - if index cannot be accessed
See Also:: Field.TermVector

getTermFreqVector

public TermFreqVector getTermFreqVector(int n,
                                        String field)
                                 throws IOException

Description copied from class: IndexReader

Return a term frequency vector for the specified document and field. The returned vector contains terms and frequencies for the terms in the specified field of this document, if the field had the storeTermVector flag set. If termvectors had been stored with positions or offsets, a TermPositionVector is returned.

Specified by:: getTermFreqVector in class IndexReader

Parameters:: n - document for which the term frequency vector is returned; field - field for which the term frequency vector is returned.
Returns:: term frequency vector. May be null if field does not exist in the specified document or term vector was not stored.
Throws:: IOException - if index cannot be accessed
See Also:: Field.TermVector

getTermFreqVector

public void getTermFreqVector(int docNumber,
                              String field,
                              TermVectorMapper mapper)
                       throws IOException

Description copied from class: IndexReader

Load the Term Vector into a user-defined data structure instead of relying on the parallel arrays of the TermFreqVector.

Specified by:: getTermFreqVector in class IndexReader

Parameters:: docNumber - The number of the document to load the vector for; field - The name of the field to load; mapper - The TermVectorMapper to process the vector. Must not be null
Throws:: IOException - if term vectors cannot be accessed or if they do not exist for the specified field and document.

getTermFreqVector

public void getTermFreqVector(int docNumber,
                              TermVectorMapper mapper)
                       throws IOException

Description copied from class: IndexReader

Map all the term vectors for all fields in a Document.

Specified by:: getTermFreqVector in class IndexReader

Parameters:: docNumber - The number of the document to load the vector for; mapper - The TermVectorMapper to process the vector. Must not be null
Throws:: IOException - if term vectors cannot be accessed or if they do not exist for the specified field and document.

hasNorms

public boolean hasNorms(String field)
                 throws IOException

Description copied from class: IndexReader

Returns true if there are norms stored for this field.

Overrides:: hasNorms in class IndexReader

Throws:: IOException

norms

public byte[] norms(String field)
             throws IOException

Description copied from class: IndexReader

Returns the byte-encoded normalization factor for the named field of every document. This is used by the search code to score documents.

Specified by:: norms in class IndexReader

Throws:: IOException
See Also:: AbstractField.setBoost(float)

norms

public void norms(String field,
                  byte[] result,
                  int offset)
           throws IOException

Description copied from class: IndexReader

Reads the byte-encoded normalization factor for the named field of every document. This is used by the search code to score documents.

Specified by:: norms in class IndexReader

Throws:: IOException
See Also:: AbstractField.setBoost(float)

doSetNorm

protected void doSetNorm(int n,
                         String field,
                         byte value)
                  throws CorruptIndexException,
                         IOException

Description copied from class: IndexReader

Implements setNorm in subclass.

Specified by:: doSetNorm in class IndexReader

Throws:: CorruptIndexException; IOException

terms

public TermEnum terms()
               throws IOException

Description copied from class: IndexReader

Returns an enumeration of all the terms in the index. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration. Note that after calling terms(), TermEnum.next() must be called on the resulting enumeration before calling other methods such as TermEnum.term().

Specified by:: terms in class IndexReader

Throws:: IOException - if there is a low-level IO error

terms

public TermEnum terms(Term term)
               throws IOException

Description copied from class: IndexReader

Returns an enumeration of all terms starting at a given term. If the given term does not exist, the enumeration is positioned at the first term greater than the supplied term. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration.
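The positioning rule (start at the given term, or at the first greater term if it is absent) can be sketched with a sorted set in plain Java. `ToyTerm` is a hypothetical stand-in whose ordering, field first and then text, is assumed here to mirror Term.compareTo(); nothing in the sketch calls Lucene.

```java
import java.util.Iterator;
import java.util.TreeSet;

// Toy model only; ToyTerm is a hypothetical stand-in, not Lucene's Term.
final class TermSeekSketch {
    static final class ToyTerm implements Comparable<ToyTerm> {
        final String field;
        final String text;

        ToyTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }

        // Orders by field name first, then by term text.
        @Override
        public int compareTo(ToyTerm other) {
            int byField = field.compareTo(other.field);
            return byField != 0 ? byField : text.compareTo(other.text);
        }

        @Override
        public String toString() {
            return field + ":" + text;
        }
    }

    // Enumerates terms in order, starting at 'start' or at the first greater term.
    static Iterator<ToyTerm> termsFrom(TreeSet<ToyTerm> dictionary, ToyTerm start) {
        return dictionary.tailSet(start, true).iterator();
    }

    static TreeSet<ToyTerm> sampleDictionary() {
        TreeSet<ToyTerm> dictionary = new TreeSet<>();
        dictionary.add(new ToyTerm("title", "apple"));
        dictionary.add(new ToyTerm("body", "cherry"));
        dictionary.add(new ToyTerm("body", "apple"));
        return dictionary;
    }

    public static void main(String[] args) {
        // "body:banana" is not in the dictionary, so enumeration starts at the next term.
        Iterator<ToyTerm> it = termsFrom(sampleDictionary(), new ToyTerm("body", "banana"));
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```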

Specified by:: terms in class IndexReader

Throws:: IOException - if there is a low-level IO error

docFreq

public int docFreq(Term term)
            throws IOException

Description copied from class: IndexReader

Returns the number of documents containing term.

Specified by:: docFreq in class IndexReader

Throws:: IOException - if there is a low-level IO error

termDocs

public TermDocs termDocs(Term term)
                  throws IOException

Description copied from class: IndexReader

Returns an enumeration of all the documents which contain term. For each document, the document number and the frequency of the term in that document are provided, for use in search scoring. If term is null, then all non-deleted docs are returned with freq=1. Thus, this method implements the mapping:

Term => <docNum, freq>*

The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.

Overrides:: termDocs in class IndexReader

Throws:: IOException - if there is a low-level IO error

termDocs

public TermDocs termDocs()
                  throws IOException

Description copied from class: IndexReader

Returns an unpositioned TermDocs enumerator.

Specified by:: termDocs in class IndexReader

Throws:: IOException - if there is a low-level IO error

termPositions

public TermPositions termPositions(Term term)
                            throws IOException

Description copied from class: IndexReader

Returns an enumeration of all the documents which contain term. For each document, in addition to the document number and frequency of the term in that document, a list of all of the ordinal positions of the term in the document is available. Thus, this method implements the mapping:

Term => <docNum, freq, <pos_1, pos_2, ... pos_freq-1>>*

This positional information facilitates phrase and proximity searching.

The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.
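To show how per-document positions make phrase matching possible, here is a small positional index in plain Java. It is a sketch under stated assumptions: `PhraseSketch` and its methods are hypothetical, tokens are split on whitespace, and only two-word phrases are handled; it does not use or reproduce Lucene's TermPositions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy positional index: term -> (docNum -> positions), with doc numbers kept in order.
final class PhraseSketch {
    private final Map<String, TreeMap<Integer, List<Integer>>> postings = new HashMap<>();

    // Records the ordinal position of every whitespace-separated token.
    void addDocument(int docNum, String text) {
        String[] tokens = text.toLowerCase().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            postings.computeIfAbsent(tokens[pos], t -> new TreeMap<>())
                    .computeIfAbsent(docNum, d -> new ArrayList<>())
                    .add(pos);
        }
    }

    // Returns, in ascending order, the documents where 'second' directly follows 'first'.
    List<Integer> phraseMatches(String first, String second) {
        List<Integer> hits = new ArrayList<>();
        TreeMap<Integer, List<Integer>> firstDocs = postings.get(first);
        TreeMap<Integer, List<Integer>> secondDocs = postings.get(second);
        if (firstDocs == null || secondDocs == null) {
            return hits;
        }
        for (Map.Entry<Integer, List<Integer>> entry : firstDocs.entrySet()) {
            List<Integer> secondPositions = secondDocs.get(entry.getKey());
            if (secondPositions == null) {
                continue; // the document lacks the second term
            }
            for (int pos : entry.getValue()) {
                if (secondPositions.contains(pos + 1)) {
                    hits.add(entry.getKey());
                    break;
                }
            }
        }
        return hits;
    }

    static PhraseSketch sample() {
        PhraseSketch index = new PhraseSketch();
        index.addDocument(0, "quick brown fox");
        index.addDocument(1, "brown quick fox");
        index.addDocument(2, "the quick brown dog");
        return index;
    }

    public static void main(String[] args) {
        System.out.println(sample().phraseMatches("quick", "brown")); // [0, 2]
    }
}
```

Document 1 contains both words but in the wrong order, so document numbers and frequencies alone could not rule it out; the positions can.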

Overrides:: termPositions in class IndexReader

Throws:: IOException - if there is a low-level IO error

termPositions

public TermPositions termPositions()
                            throws IOException

Description copied from class: IndexReader

Returns an unpositioned TermPositions enumerator.

Specified by:: termPositions in class IndexReader

Throws:: IOException - if there is a low-level IO error

isCurrent

public boolean isCurrent()
                  throws CorruptIndexException,
                         IOException

Checks recursively if all subreaders are up to date.

Overrides:: isCurrent in class IndexReader

Throws:: CorruptIndexException - if the index is corrupt; IOException - if there is a low-level IO error

isOptimized

public boolean isOptimized()

Checks recursively if all subindexes are optimized.

Overrides:: isOptimized in class IndexReader

Returns:: true if the index is optimized; false otherwise

getVersion

public long getVersion()

Not implemented.

Overrides:: getVersion in class IndexReader

Throws:: UnsupportedOperationException

doCommit

protected void doCommit(Map<String,String> commitUserData)
                 throws IOException

Description copied from class: IndexReader

Implements commit.

Specified by:: doCommit in class IndexReader

Throws:: IOException

doClose

protected void doClose()
                throws IOException

Description copied from class: IndexReader

Implements close.

Specified by:: doClose in class IndexReader

Throws:: IOException

getFieldNames

public Collection<String> getFieldNames(IndexReader.FieldOption fieldNames)

Description copied from class: IndexReader

Get a list of unique field names that exist in this index and have the specified field option information.

Specified by:: getFieldNames in class IndexReader

Parameters:: fieldNames - specifies which field option should be available for the returned fields
Returns:: Collection of Strings indicating the names of the fields.
See Also:: IndexReader.FieldOption
