public final class NGramTokenFilter extends TokenFilter
You must specify the required Version compatibility when
creating an NGramTokenFilter. The behavior of this token filter changed in
Lucene 4.4; among other things, it no longer updates offsets.
You can make this filter use the old behavior by providing a version <
Version.LUCENE_44 in the constructor but this is not recommended as
it will lead to broken TokenStreams that will cause highlighting
bugs.
If you were using this TokenFilter to perform partial highlighting,
this won't work anymore since this filter doesn't update offsets. You should
modify your analysis chain to use NGramTokenizer, and potentially
override NGramTokenizer.isTokenChar(int) to perform pre-tokenization.

Nested classes inherited from AttributeSource: AttributeSource.AttributeFactory, AttributeSource.State

| Modifier and Type | Field and Description |
|---|---|
| static int | DEFAULT_MAX_NGRAM_SIZE |
| static int | DEFAULT_MIN_NGRAM_SIZE |

Fields inherited from TokenFilter: input

| Constructor and Description |
|---|
| NGramTokenFilter(Version version, TokenStream input). Creates an NGramTokenFilter with the default minimum and maximum n-gram sizes. |
| NGramTokenFilter(Version version, TokenStream input, int minGram, int maxGram). Creates an NGramTokenFilter with the given minimum and maximum n-gram sizes. |
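
To make the minGram and maxGram parameters concrete, here is a minimal plain-Java sketch of n-gram enumeration for a single token. It is not Lucene's implementation: the class name NGramSketch, the argument check, and the emission order (by start position, then by increasing length, which is my understanding of the 4.4 behavior rather than something stated on this page) are assumptions, and it works on UTF-16 chars rather than code points.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java illustration only; NGramSketch is a hypothetical name, not a Lucene class. */
class NGramSketch {

    /**
     * Lists every substring of token whose length lies between minGram and
     * maxGram (inclusive), ordered by start position and then by length.
     * Operates on UTF-16 chars, so supplementary characters are not handled.
     */
    static List<String> ngrams(String token, int minGram, int maxGram) {
        // Assumed validation for this sketch; not taken from the Lucene source.
        if (minGram < 1 || minGram > maxGram) {
            throw new IllegalArgumentException("need 1 <= minGram <= maxGram");
        }
        List<String> grams = new ArrayList<String>();
        for (int start = 0; start < token.length(); start++) {
            // Grow the gram from minGram up to maxGram, stopping at the token end.
            for (int len = minGram; len <= maxGram && start + len <= token.length(); len++) {
                grams.add(token.substring(start, start + len));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(ngrams("abc", 1, 2)); // prints [a, ab, b, bc, c]
    }
}
```

With minGram = 1 and maxGram = 3 the same token would also yield "abc", placed right after "ab".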

| Modifier and Type | Method and Description |
|---|---|
| boolean | incrementToken(). Advances to the next token in the stream; returns false at end of stream. |
| void | reset() |
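
Because incrementToken() returns a boolean, consumers typically drive the stream with a pull-style loop. The sketch below shows that loop against a hypothetical stand-in class (FakeStream is not a Lucene type), so it runs without Lucene on the classpath; the term() accessor is likewise an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Plain-Java illustration of the consumer loop; none of these types are Lucene classes. */
class PullLoopSketch {

    /** Hypothetical stand-in for a token stream. */
    static final class FakeStream {
        private final Iterator<String> source;
        private String term; // valid only after incrementToken() has returned true

        FakeStream(List<String> terms) {
            this.source = terms.iterator();
        }

        /** Advances to the next term; returns false once the stream is exhausted. */
        boolean incrementToken() {
            if (!source.hasNext()) {
                return false;
            }
            term = source.next();
            return true;
        }

        String term() {
            return term;
        }
    }

    /** Pulls terms until incrementToken() reports the end of the stream. */
    static List<String> drain(FakeStream stream) {
        List<String> seen = new ArrayList<String>();
        while (stream.incrementToken()) {
            seen.add(stream.term());
        }
        return seen;
    }

    public static void main(String[] args) {
        FakeStream stream = new FakeStream(Arrays.asList("a", "ab", "b"));
        System.out.println(drain(stream)); // prints [a, ab, b]
    }
}
```

With the real filter, the current term is read through an attribute obtained from addAttribute (for example a CharTermAttribute) rather than a term() method, and the consumer calls reset() before the loop and end() and close() after it.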

Methods inherited from TokenFilter: close, end

Methods inherited from AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString

public static final int DEFAULT_MIN_NGRAM_SIZE
public static final int DEFAULT_MAX_NGRAM_SIZE

public NGramTokenFilter(Version version, TokenStream input, int minGram, int maxGram)
Parameters:
version - Lucene version to enable correct position increments. See above for details.
input - TokenStream holding the input to be tokenized
minGram - the smallest n-gram to generate
maxGram - the largest n-gram to generate

public NGramTokenFilter(Version version, TokenStream input)
Parameters:
version - Lucene version to enable correct position increments. See above for details.
input - TokenStream holding the input to be tokenized

public final boolean incrementToken()
throws IOException
Specified by: incrementToken in class TokenStream
Throws: IOException

public void reset()
throws IOException
Overrides: reset in class TokenFilter
Throws: IOException

Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.