org.apache.lucene.index
Class TermsEnum

java.lang.Object
  extended by org.apache.lucene.index.TermsEnum
All Implemented Interfaces:
BytesRefIterator
Direct Known Subclasses:
FilterAtomicReader.FilterTermsEnum, FilteredTermsEnum, FuzzyTermsEnum, MultiTermsEnum

public abstract class TermsEnum
extends Object
implements BytesRefIterator

Iterator to seek (seekCeil(BytesRef), seekExact(BytesRef,boolean)) or step through (BytesRefIterator.next()) terms to obtain frequency information (docFreq()), or a DocsEnum or DocsAndPositionsEnum for the current term (docs(org.apache.lucene.util.Bits, org.apache.lucene.index.DocsEnum)).

Term enumerations are always ordered by BytesRefIterator.getComparator(). Each term in the enumeration is greater than the one before it.

The TermsEnum is unpositioned when you first obtain it; you must successfully call BytesRefIterator.next() or one of the seek methods before calling any method that refers to the current term.
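
The positioning contract above can be illustrated with a minimal, JDK-only sketch. SketchTermIterator is a hypothetical stand-in backed by a sorted String array, not a Lucene class, and the exception it throws when unpositioned is a choice of this sketch; the real TermsEnum only says not to make such calls. The null return from next() at the end of the terms follows the BytesRefIterator contract.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Stand-in for a term enumeration backed by a sorted array; not a Lucene class. */
class SketchTermIterator {
    private final String[] sortedTerms;
    private int pos = -1; // -1: unpositioned, as when the enum is first obtained

    SketchTermIterator(String... terms) {
        sortedTerms = terms.clone();
        Arrays.sort(sortedTerms); // terms are enumerated in sorted order
    }

    /** Steps to the next term; returns null once the terms are exhausted. */
    String next() {
        if (pos + 1 >= sortedTerms.length) {
            pos = sortedTerms.length; // exhausted, so unpositioned again
            return null;
        }
        pos++;
        return sortedTerms[pos];
    }

    /** Current term. Throwing when unpositioned is a choice of this sketch only. */
    String term() {
        if (pos < 0 || pos >= sortedTerms.length) {
            throw new IllegalStateException("enum is unpositioned");
        }
        return sortedTerms[pos];
    }

    /** Drains the iterator with the usual next() loop. */
    static List<String> drain(SketchTermIterator it) {
        List<String> seen = new ArrayList<>();
        for (String t = it.next(); t != null; t = it.next()) {
            seen.add(t);
        }
        return seen;
    }

    public static void main(String[] args) {
        SketchTermIterator it = new SketchTermIterator("pear", "apple", "fig");
        System.out.println(drain(it)); // prints [apple, fig, pear]
    }
}
```

With a real TermsEnum the loop has the same shape, with BytesRef in place of String.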

WARNING: This API is experimental and might change in incompatible ways in the next release.

Nested Class Summary
static class TermsEnum.SeekStatus
          Represents the result returned by seekCeil(org.apache.lucene.util.BytesRef, boolean).
 
Field Summary
static TermsEnum EMPTY
          An empty TermsEnum for quickly returning an empty instance, e.g. in MultiTermQuery.
 
Constructor Summary
protected TermsEnum()
          Sole constructor.
 
Method Summary
 AttributeSource attributes()
          Returns the related attributes.
abstract  int docFreq()
          Returns the number of documents containing the current term.
 DocsEnum docs(Bits liveDocs, DocsEnum reuse)
          Get DocsEnum for the current term.
abstract  DocsEnum docs(Bits liveDocs, DocsEnum reuse, int flags)
          Get DocsEnum for the current term, with control over whether freqs are required.
 DocsAndPositionsEnum docsAndPositions(Bits liveDocs, DocsAndPositionsEnum reuse)
          Get DocsAndPositionsEnum for the current term.
abstract  DocsAndPositionsEnum docsAndPositions(Bits liveDocs, DocsAndPositionsEnum reuse, int flags)
          Get DocsAndPositionsEnum for the current term, with control over whether offsets and payloads are required.
abstract  long ord()
          Returns ordinal position for current term.
 TermsEnum.SeekStatus seekCeil(BytesRef text)
          Seeks to the specified term, if it exists, or to the next (ceiling) term.
abstract  TermsEnum.SeekStatus seekCeil(BytesRef text, boolean useCache)
          Expert: just like seekCeil(BytesRef) but allows you to control whether the implementation should attempt to use its term cache (if it uses one).
 boolean seekExact(BytesRef text, boolean useCache)
          Attempts to seek to the exact term, returning true if the term is found.
 void seekExact(BytesRef term, TermState state)
          Expert: Seeks a specific position by TermState previously obtained from termState().
abstract  void seekExact(long ord)
          Seeks to the specified term by ordinal (position) as previously returned by ord().
abstract  BytesRef term()
          Returns current term.
 TermState termState()
          Expert: Returns the TermsEnum's internal state to position the TermsEnum without re-seeking the term dictionary.
abstract  long totalTermFreq()
          Returns the total number of occurrences of this term across all documents (the sum of the freq() for each doc that has this term).
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.apache.lucene.util.BytesRefIterator
getComparator, next
 

Field Detail

EMPTY

public static final TermsEnum EMPTY
An empty TermsEnum for quickly returning an empty instance, e.g. in MultiTermQuery.

Please note: This enum should be unmodifiable, but it is currently possible to add Attributes to it. This should not be a problem, as the enum is always empty and the existence of unused Attributes does not matter.

Constructor Detail

TermsEnum

protected TermsEnum()
Sole constructor. (For invocation by subclass constructors, typically implicit.)

Method Detail

attributes

public AttributeSource attributes()
Returns the related attributes.


seekExact

public boolean seekExact(BytesRef text,
                         boolean useCache)
                  throws IOException
Attempts to seek to the exact term, returning true if the term is found. If this returns false, the enum is unpositioned. For some codecs, seekExact may be substantially faster than seekCeil(org.apache.lucene.util.BytesRef, boolean).

Throws:
IOException

seekCeil

public abstract TermsEnum.SeekStatus seekCeil(BytesRef text,
                                              boolean useCache)
                                       throws IOException
Expert: just like seekCeil(BytesRef) but allows you to control whether the implementation should attempt to use its term cache (if it uses one).

Throws:
IOException

seekCeil

public final TermsEnum.SeekStatus seekCeil(BytesRef text)
                                    throws IOException
Seeks to the specified term, if it exists, or to the next (ceiling) term. Returns a SeekStatus indicating whether the exact term was found, a different term was found, or EOF was hit. The target term may be before or after the current term. If this returns SeekStatus.END, the enum is unpositioned.

Throws:
IOException
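
The three outcomes of a ceiling seek can be sketched without Lucene on the classpath. SeekCeilSketch below is a hypothetical model built on java.util.TreeSet; its Status enum mirrors the FOUND, NOT_FOUND and END values of TermsEnum.SeekStatus, and String stands in for BytesRef. It is an illustration of the contract, not the library's implementation.

```java
import java.util.Arrays;
import java.util.TreeSet;

/** Models the seekCeil contract on a sorted set of strings; not a Lucene class. */
class SeekCeilSketch {
    /** Mirrors the three outcomes described for TermsEnum.SeekStatus. */
    enum Status { FOUND, NOT_FOUND, END }

    private final TreeSet<String> terms = new TreeSet<>();
    private String current; // null: unpositioned

    SeekCeilSketch(String... initialTerms) {
        terms.addAll(Arrays.asList(initialTerms));
    }

    /** Positions on the target if present, else on the smallest term greater than it. */
    Status seekCeil(String target) {
        current = terms.ceiling(target); // smallest term >= target, or null
        if (current == null) {
            return Status.END; // nothing at or after the target: unpositioned
        }
        return current.equals(target) ? Status.FOUND : Status.NOT_FOUND;
    }

    /** Current term, or null when the sketch is unpositioned. */
    String term() {
        return current;
    }

    public static void main(String[] args) {
        SeekCeilSketch e = new SeekCeilSketch("apple", "fig", "pear");
        System.out.println(e.seekCeil("fig") + " " + e.term());   // FOUND fig
        System.out.println(e.seekCeil("grape") + " " + e.term()); // NOT_FOUND pear
        System.out.println(e.seekCeil("zebra") + " " + e.term()); // END null
        System.out.println(e.seekCeil("apple") + " " + e.term()); // FOUND apple (backward seek)
    }
}
```

Callers typically branch on the returned status before reading the current term, because after END there is no current term to read.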

seekExact

public abstract void seekExact(long ord)
                        throws IOException
Seeks to the specified term by ordinal (position) as previously returned by ord(). The target ord may be before or after the current ord, and must be within bounds.

Throws:
IOException
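
As a rough illustration of the ord round trip, the sketch below treats the ordinal as an index into a sorted array. OrdSeekSketch is a hypothetical stand-in, not a Lucene class; the exception on an out-of-bounds ordinal is this sketch's choice, since the documentation only states that the ord must be within bounds. Real codecs may not support ordinals at all.

```java
import java.util.Arrays;

/** Models ord() and seekExact(long) over a sorted array; not a Lucene class. */
class OrdSeekSketch {
    private final String[] sortedTerms;
    private long ord = -1; // -1: unpositioned

    OrdSeekSketch(String... terms) {
        sortedTerms = terms.clone();
        Arrays.sort(sortedTerms);
    }

    /** Positions the sketch on the term at the given ordinal. */
    void seekExact(long targetOrd) {
        if (targetOrd < 0 || targetOrd >= sortedTerms.length) {
            // Throwing is this sketch's choice for an out-of-bounds ord.
            throw new IndexOutOfBoundsException("ord out of bounds: " + targetOrd);
        }
        ord = targetOrd;
    }

    /** Ordinal of the current term; only meaningful after a successful seek. */
    long ord() {
        return ord;
    }

    /** Current term; only valid after a successful seek. */
    String term() {
        return sortedTerms[(int) ord];
    }

    public static void main(String[] args) {
        OrdSeekSketch e = new OrdSeekSketch("apple", "fig", "pear");
        e.seekExact(2);
        long saved = e.ord();  // remember the position of "pear"
        e.seekExact(0);        // the target ord may be before the current ord
        e.seekExact(saved);    // or after it
        System.out.println(e.term()); // prints pear
    }
}
```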

seekExact

public void seekExact(BytesRef term,
                      TermState state)
               throws IOException
Expert: Seeks a specific position by TermState previously obtained from termState(). Callers should maintain the TermState to use this method. Low-level implementations may position the TermsEnum without re-seeking the term dictionary.

Seek by TermState only if the enum the state was obtained from and the enum being sought were both obtained from the same IndexReader.

NOTE: Using this method with an incompatible TermState might leave this TermsEnum in undefined state. On a segment level TermState instances are compatible only iff the source and the target TermsEnum operate on the same field. If operating on segment level, TermState instances must not be used across segments.

NOTE: A seek by TermState might not restore the AttributeSource's state. AttributeSource states must be maintained separately if this method is used.

Parameters:
term - the term the TermState corresponds to
state - the TermState
Throws:
IOException

term

public abstract BytesRef term()
                       throws IOException
Returns current term. Do not call this when the enum is unpositioned.

Throws:
IOException

ord

public abstract long ord()
                  throws IOException
Returns ordinal position for current term. This is an optional method (the codec may throw UnsupportedOperationException). Do not call this when the enum is unpositioned.

Throws:
IOException

docFreq

public abstract int docFreq()
                     throws IOException
Returns the number of documents containing the current term. Do not call this when the enum is unpositioned, for example after a seek returned TermsEnum.SeekStatus.END.

Throws:
IOException

totalTermFreq

public abstract long totalTermFreq()
                            throws IOException
Returns the total number of occurrences of this term across all documents (the sum of the freq() for each doc that has this term). This will be -1 if the codec doesn't support this measure. Note that, like other term measures, this measure does not take deleted documents into account.

Throws:
IOException
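
The relationship between docFreq() and totalTermFreq() comes down to simple arithmetic, sketched here with hypothetical data. TermStatsSketch and its map of document id to within-document frequency are illustrative only and do not correspond to any Lucene structure.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Works through docFreq vs. totalTermFreq for one term; not a Lucene class. */
class TermStatsSketch {
    /** Number of documents that contain the term. */
    static int docFreq(Map<Integer, Integer> freqByDoc) {
        return freqByDoc.size();
    }

    /** Sum of the per-document frequencies of the term. */
    static long totalTermFreq(Map<Integer, Integer> freqByDoc) {
        long total = 0;
        for (int freq : freqByDoc.values()) {
            total += freq;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical postings for one term: document id -> occurrences in that document.
        Map<Integer, Integer> freqByDoc = new LinkedHashMap<>();
        freqByDoc.put(0, 2);
        freqByDoc.put(3, 1);
        freqByDoc.put(7, 3);
        System.out.println(docFreq(freqByDoc));       // prints 3
        System.out.println(totalTermFreq(freqByDoc)); // prints 6
    }
}
```

Three documents contain the term, so the document frequency is 3, while the occurrences sum to 2 + 1 + 3 = 6.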

docs

public final DocsEnum docs(Bits liveDocs,
                           DocsEnum reuse)
                    throws IOException
Get DocsEnum for the current term. Do not call this when the enum is unpositioned. This method will not return null.

Parameters:
liveDocs - unset bits are documents that should not be returned
reuse - pass a prior DocsEnum for possible reuse
Throws:
IOException
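
The effect of the liveDocs parameter can be modelled with a java.util.BitSet. LiveDocsSketch and visibleDocs are hypothetical names, and BitSet stands in for Bits. Treating a null liveDocs as "no filtering" is an assumption of this sketch that this page does not state.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Models how liveDocs filters the documents of a term; not a Lucene class. */
class LiveDocsSketch {
    /**
     * Returns the documents whose bit is set in liveDocs.
     * A null liveDocs means no filtering (assumption of this sketch).
     */
    static List<Integer> visibleDocs(int[] postings, BitSet liveDocs) {
        List<Integer> visible = new ArrayList<>();
        for (int doc : postings) {
            if (liveDocs == null || liveDocs.get(doc)) {
                visible.add(doc); // unset bits are documents that should not be returned
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        int[] postings = {1, 4, 6, 9}; // hypothetical documents containing the term
        BitSet liveDocs = new BitSet();
        liveDocs.set(1);
        liveDocs.set(6);
        liveDocs.set(9); // document 4 is left unset, so it is skipped
        System.out.println(visibleDocs(postings, liveDocs)); // prints [1, 6, 9]
        System.out.println(visibleDocs(postings, null));     // prints [1, 4, 6, 9]
    }
}
```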

docs

public abstract DocsEnum docs(Bits liveDocs,
                              DocsEnum reuse,
                              int flags)
                       throws IOException
Get DocsEnum for the current term, with control over whether freqs are required. Do not call this when the enum is unpositioned. This method will not return null.

Parameters:
liveDocs - unset bits are documents that should not be returned
reuse - pass a prior DocsEnum for possible reuse
flags - specifies which optional per-document values you require; see DocsEnum.FLAG_FREQS
Throws:
IOException
See Also:
docs(Bits, DocsEnum)

docsAndPositions

public final DocsAndPositionsEnum docsAndPositions(Bits liveDocs,
                                                   DocsAndPositionsEnum reuse)
                                            throws IOException
Get DocsAndPositionsEnum for the current term. Do not call this when the enum is unpositioned. This method will return null if positions were not indexed.

Parameters:
liveDocs - unset bits are documents that should not be returned
reuse - pass a prior DocsAndPositionsEnum for possible reuse
Throws:
IOException
See Also:
docsAndPositions(Bits, DocsAndPositionsEnum, int)

docsAndPositions

public abstract DocsAndPositionsEnum docsAndPositions(Bits liveDocs,
                                                      DocsAndPositionsEnum reuse,
                                                      int flags)
                                               throws IOException
Get DocsAndPositionsEnum for the current term, with control over whether offsets and payloads are required. Some codecs may be able to optimize their implementation when offsets and/or payloads are not required. Do not call this when the enum is unpositioned. This will return null if positions were not indexed.

Parameters:
liveDocs - unset bits are documents that should not be returned
reuse - pass a prior DocsAndPositionsEnum for possible reuse
flags - specifies which optional per-position values you require; see DocsAndPositionsEnum.FLAG_OFFSETS and DocsAndPositionsEnum.FLAG_PAYLOADS.
Throws:
IOException

termState

public TermState termState()
                    throws IOException
Expert: Returns the TermsEnum's internal state to position the TermsEnum without re-seeking the term dictionary.

NOTE: The returned TermState might not capture the AttributeSource's state. Callers must maintain the AttributeSource states separately.

Throws:
IOException
See Also:
TermState, seekExact(BytesRef, TermState)


Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.