public class FuzzyTermsEnum extends TermsEnum
Term enumerations are always ordered by
getComparator(). Each term in the enumeration is
greater than all that precede it.
| Modifier and Type | Class and Description |
|---|---|
| static interface | FuzzyTermsEnum.LevenshteinAutomataAttribute - reuses compiled automata across different segments, because they are independent of the index |
| static class | FuzzyTermsEnum.LevenshteinAutomataAttributeImpl - Stores compiled automata as a list (indexed by edit distance) |
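
The idea of keeping compiled automata in a list indexed by edit distance can be sketched in plain Java. This is only an illustration under stated assumptions, not Lucene's implementation: `PerDistanceCache`, `forDistance`, and `compileCount` are hypothetical names, and a `String` stands in for a compiled automaton. Entries depend only on the edit distance (not on any index segment), so once built they can be handed out again without recompiling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

/** Hypothetical cache: slot d holds the value compiled for edit distance d. */
final class PerDistanceCache<A> {
    private final List<A> byDistance = new ArrayList<>();
    private final IntFunction<A> compiler; // assumed factory: edit distance -> compiled value
    int compileCount = 0;                  // exposed only to make reuse observable

    PerDistanceCache(IntFunction<A> compiler) {
        this.compiler = compiler;
    }

    /** Returns the entry for the given distance, compiling any missing slots up to it. */
    A forDistance(int distance) {
        for (int d = byDistance.size(); d <= distance; d++) {
            byDistance.add(compiler.apply(d));
            compileCount++;
        }
        return byDistance.get(distance);
    }
}
```

A caller that asks for distance 2 and later for distance 1 triggers three compilations in total; the second request is served from the list.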
Nested classes/interfaces inherited from class TermsEnum: TermsEnum.SeekStatus

| Modifier and Type | Field and Description |
|---|---|
| protected int | maxEdits |
| protected float | minSimilarity |
| protected boolean | raw |
| protected int | realPrefixLength |
| protected float | scale_factor |
| protected int | termLength |
| protected Terms | terms |
| protected int[] | termText |

| Constructor and Description |
|---|
| FuzzyTermsEnum(Terms terms, AttributeSource atts, Term term, float minSimilarity, int prefixLength, boolean transpositions) - Constructor for enumeration of all terms from specified reader which share a prefix of length prefixLength with term and which have a fuzzy similarity > minSimilarity. |
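
The matching rule in the constructor description (shared prefix plus bounded fuzziness) can be illustrated with a brute-force sketch. It makes several assumptions that are not taken from this page: `FuzzyMatchSketch`, `editDistance`, and `fuzzyMatches` are hypothetical names, minSimilarity is treated as a whole number of allowed edits, strings are compared by `char` rather than by code point, and a transposition of two adjacent characters counts as one edit when `transpositions` is true. The class itself works with compiled automata rather than a dynamic-programming table, so this shows only which terms would qualify, not how they are found.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FuzzyMatchSketch {

    /** Edit distance; with transpositions, swapping two adjacent chars costs one edit. */
    static int editDistance(String a, String b, boolean transpositions) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                    d[i - 1][j - 1] + cost);
                if (transpositions && i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[a.length()][b.length()];
    }

    /** Keeps candidates that share the required prefix and lie within maxEdits of term. */
    static List<String> fuzzyMatches(List<String> sortedTerms, String term,
                                     int maxEdits, int prefixLength, boolean transpositions) {
        String prefix = term.substring(0, Math.min(prefixLength, term.length()));
        List<String> out = new ArrayList<>();
        for (String candidate : sortedTerms) {
            if (candidate.startsWith(prefix)
                    && editDistance(term, candidate, transpositions) <= maxEdits) {
                out.add(candidate); // input order is preserved, so sorted input stays sorted
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("lucene", "lucent", "lucid", "mucene");
        System.out.println(fuzzyMatches(terms, "lucene", 1, 1, true));
    }
}
```

With a prefix length of 1, "mucene" is rejected even though it is one edit away, because it does not share the first character of the pattern term.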
| Modifier and Type | Method and Description |
|---|---|
| int | docFreq() - Returns the number of documents containing the current term. |
| DocsEnum | docs(Bits liveDocs, DocsEnum reuse, int flags) - Get DocsEnum for the current term, with control over whether freqs are required. |
| DocsAndPositionsEnum | docsAndPositions(Bits liveDocs, DocsAndPositionsEnum reuse, int flags) - Get DocsAndPositionsEnum for the current term, with control over whether offsets and payloads are required. |
| protected TermsEnum | getAutomatonEnum(int editDistance, BytesRef lastTerm) - return an automata-based enum for matching up to editDistance from lastTerm, if possible |
| Comparator<BytesRef> | getComparator() - Return the BytesRef Comparator used to sort terms provided by the iterator. |
| float | getMinSimilarity() |
| float | getScaleFactor() |
| protected void | maxEditDistanceChanged(BytesRef lastTerm, int maxEdits, boolean init) |
| BytesRef | next() - Increments the iteration to the next BytesRef in the iterator. |
| long | ord() - Returns ordinal position for current term. |
| TermsEnum.SeekStatus | seekCeil(BytesRef text) - Seeks to the specified term, if it exists, or to the next (ceiling) term. |
| boolean | seekExact(BytesRef text) - Attempts to seek to the exact term, returning true if the term is found. |
| void | seekExact(BytesRef term, TermState state) - Expert: Seeks a specific position by TermState previously obtained from TermsEnum.termState(). |
| void | seekExact(long ord) - Seeks to the specified term by ordinal (position) as previously returned by TermsEnum.ord(). |
| protected void | setEnum(TermsEnum actualEnum) - swap in a new actual enum to proxy to |
| BytesRef | term() - Returns current term. |
| TermState | termState() - Expert: Returns the TermsEnum's internal state to position the TermsEnum without re-seeking the term dictionary. |
| long | totalTermFreq() - Returns the total number of occurrences of this term across all documents (the sum of the freq() for each doc that has this term). |

Methods inherited from class TermsEnum: attributes, docs, docsAndPositions

protected final float minSimilarity
protected final float scale_factor
protected final int termLength
protected int maxEdits
protected final boolean raw
protected final Terms terms
protected final int[] termText
protected final int realPrefixLength

public FuzzyTermsEnum(Terms terms, AttributeSource atts, Term term, float minSimilarity, int prefixLength, boolean transpositions) throws IOException
Constructor for enumeration of all terms from specified reader which share a prefix of length prefixLength with term and which have a fuzzy similarity > minSimilarity.
After calling the constructor the enumeration is already pointing to the first valid term if such a term exists.
Parameters:
terms - Delivers terms.
atts - AttributeSource created by the rewrite method of MultiTermQuery that contains information about competitive boosts during rewrite. It is also used to cache DFAs between segment transitions.
term - Pattern term.
minSimilarity - Minimum required similarity for terms from the reader. Pass an integer value representing edit distance. Passing a fraction is deprecated.
prefixLength - Length of required common prefix. Default value is 0.
Throws:
IOException - if there is a low-level IO error

protected TermsEnum getAutomatonEnum(int editDistance, BytesRef lastTerm) throws IOException
return an automata-based enum for matching up to editDistance from lastTerm, if possible
Throws:
IOException

protected void setEnum(TermsEnum actualEnum)
swap in a new actual enum to proxy to

protected void maxEditDistanceChanged(BytesRef lastTerm, int maxEdits, boolean init) throws IOException
Throws:
IOException

public BytesRef next() throws IOException
Description copied from interface: BytesRefIterator
Increments the iteration to the next BytesRef in the iterator. Returns the resulting BytesRef or null if the end of the iterator is reached. The returned BytesRef may be re-used across calls to next. After this method returns null, do not call it again: the results are undefined.
Returns:
the next BytesRef in the iterator or null if the end of the iterator is reached.
Throws:
IOException - If there is a low-level I/O error.

public int docFreq() throws IOException
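Description copied from class: TermsEnum
Returns the number of documents containing the current term. See also TermsEnum.SeekStatus.END.
Specified by:
docFreq in class TermsEnum
Throws:
IOException

public long totalTermFreq() throws IOException
Description copied from class: TermsEnum
Returns the total number of occurrences of this term across all documents (the sum of the freq() for each doc that has this term).
Specified by:
totalTermFreq in class TermsEnum
Throws:
IOException

public DocsEnum docs(Bits liveDocs, DocsEnum reuse, int flags) throws IOException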
Description copied from class: TermsEnum
Get DocsEnum for the current term, with control over whether freqs are required. Do not call this when the enum is unpositioned. This method will not return null.
Specified by:
docs in class TermsEnum
Parameters:
liveDocs - unset bits are documents that should not be returned
reuse - pass a prior DocsEnum for possible reuse
flags - specifies which optional per-document values you require; see DocsEnum.FLAG_FREQS
Throws:
IOException
See Also:
TermsEnum.docs(Bits, DocsEnum, int)

public DocsAndPositionsEnum docsAndPositions(Bits liveDocs, DocsAndPositionsEnum reuse, int flags) throws IOException
Description copied from class: TermsEnum
Get DocsAndPositionsEnum for the current term, with control over whether offsets and payloads are required. Some codecs may be able to optimize their implementation when offsets and/or payloads are not required. Do not call this when the enum is unpositioned. This will return null if positions were not indexed.
Specified by:
docsAndPositions in class TermsEnum
Parameters:
liveDocs - unset bits are documents that should not be returned
reuse - pass a prior DocsAndPositionsEnum for possible reuse
flags - specifies which optional per-position values you require; see DocsAndPositionsEnum.FLAG_OFFSETS and DocsAndPositionsEnum.FLAG_PAYLOADS.
Throws:
IOException

public void seekExact(BytesRef term, TermState state) throws IOException
Description copied from class: TermsEnum
Expert: Seeks a specific position by TermState previously obtained from TermsEnum.termState(). Callers should maintain the TermState to use this method. Low-level implementations may position the TermsEnum without re-seeking the term dictionary.
Seeking by TermState should be used only if the state was obtained from the same TermsEnum instance.
NOTE: Using this method with an incompatible TermState might leave this TermsEnum in an undefined state. On a segment level, TermState instances are compatible only iff the source and the target TermsEnum operate on the same field. If operating on segment level, TermState instances must not be used across segments.
NOTE: A seek by TermState might not restore the AttributeSource's state. AttributeSource states must be maintained separately if this method is used.
Overrides:
seekExact in class TermsEnum
Parameters:
term - the term the TermState corresponds to
state - the TermState
Throws:
IOException

public TermState termState() throws IOException
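Description copied from class: TermsEnum
Expert: Returns the TermsEnum's internal state to position the TermsEnum without re-seeking the term dictionary.
NOTE: A seek by TermState might not capture the AttributeSource's state. Callers must maintain the AttributeSource states separately.
Overrides:
termState in class TermsEnum
Throws:
IOException
See Also:
TermState, TermsEnum.seekExact(BytesRef, TermState)

public Comparator<BytesRef> getComparator()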
Description copied from interface: BytesRefIterator
Return the BytesRef Comparator used to sort terms provided by the iterator. This may return null if there are no items or the iterator is not sorted. Callers may invoke this method many times, so it's best to cache a single instance and reuse it.

public long ord() throws IOException
Description copied from class: TermsEnum
Returns ordinal position for current term. This is an optional method (the codec may throw UnsupportedOperationException). Do not call this when the enum is unpositioned.
Specified by:
ord in class TermsEnum
Throws:
IOException

public boolean seekExact(BytesRef text) throws IOException
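Description copied from class: TermsEnum
Attempts to seek to the exact term, returning true if the term is found. See also TermsEnum.seekCeil(org.apache.lucene.util.BytesRef).
Overrides:
seekExact in class TermsEnum
Throws:
IOException

public TermsEnum.SeekStatus seekCeil(BytesRef text) throws IOException
Description copied from class: TermsEnum
Seeks to the specified term, if it exists, or to the next (ceiling) term.
Specified by:
seekCeil in class TermsEnum
Throws:
IOException

public void seekExact(long ord) throws IOException
Description copied from class: TermsEnum
Seeks to the specified term by ordinal (position) as previously returned by TermsEnum.ord(). The target ord may be before or after the current ord, and must be within bounds.
Specified by:
seekExact in class TermsEnum
Throws:
IOException

public BytesRef term() throws IOException
Description copied from class: TermsEnum
Returns current term.
Specified by:
term in class TermsEnum
Throws:
IOException

public float getMinSimilarity()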
public float getScaleFactor()
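
The next() contract documented above (null marks the end, and the returned object may be reused between calls) is easy to misuse, so here is a small plain-Java sketch of the calling pattern. It does not use Lucene classes: `ReusingTermIterator` and `collectCopies` are hypothetical names, and a shared `StringBuilder` stands in for a reused BytesRef. The point it illustrates is that a caller who wants to keep a term must copy it before advancing, and must stop at the first null.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical iterator that hands out the same mutable buffer on every call. */
final class ReusingTermIterator {
    private final String[] terms;
    private final StringBuilder shared = new StringBuilder();
    private int pos = 0;

    ReusingTermIterator(String... terms) {
        this.terms = terms;
    }

    /** Returns the shared buffer holding the next term, or null once exhausted. */
    StringBuilder next() {
        if (pos >= terms.length) {
            return null;
        }
        shared.setLength(0);
        shared.append(terms[pos++]);
        return shared;
    }

    /** Drains the iterator, copying each term before the buffer is overwritten. */
    static List<String> collectCopies(ReusingTermIterator it) {
        List<String> out = new ArrayList<>();
        for (StringBuilder t = it.next(); t != null; t = it.next()) {
            out.add(t.toString()); // copy now; the buffer changes on the next call
        }
        return out; // the loop ends at the first null and never calls next() again
    }
}
```

With real BytesRef values the analogous step would be to copy the bytes (for example with BytesRef.deepCopyOf) before the following call to next().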
Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.