public final class NGramTokenFilter extends TokenFilter
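You must specify the required Version compatibility when creating an NGramTokenFilter. As of Lucene 4.4, this token filter:
- handles supplementary characters correctly,
- emits all n-grams for the same token at the same position,
- does not modify offsets,
- sorts n-grams by their offset in the original token first, then by increasing length (so "abc" gives "a", "ab", "abc", "b", "bc", "c").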
You can make this filter use the old behavior by providing a version < Version.LUCENE_44 in the constructor, but this is not recommended, as it will lead to broken TokenStreams that will cause highlighting bugs.
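As a rough illustration of the post-4.4 emission order (by start position within the token, then by increasing length), the following plain-Java sketch generates n-grams over code points so that a supplementary character counts as a single unit. It is not Lucene's implementation: the class `NGramOrderSketch` and its `ngrams` method are hypothetical names used only for this example, and the argument check is an assumption about sensible bounds.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Lucene.
final class NGramOrderSketch {

    /** Returns the n-grams of token, ordered by start position, then by increasing length. */
    static List<String> ngrams(String token, int minGram, int maxGram) {
        if (minGram < 1 || minGram > maxGram) {
            throw new IllegalArgumentException("require 1 <= minGram <= maxGram");
        }
        // Work on code points so a surrogate pair is never split in half.
        int[] codePoints = token.codePoints().toArray();
        List<String> grams = new ArrayList<>();
        for (int start = 0; start < codePoints.length; start++) {
            for (int len = minGram; len <= maxGram && start + len <= codePoints.length; len++) {
                grams.add(new String(codePoints, start, len));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(ngrams("abc", 1, 3)); // prints [a, ab, abc, b, bc, c]
    }
}
```

In the real filter, all of these n-grams would share the position and offsets of the token they came from.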
If you were using this TokenFilter to perform partial highlighting, this won't work anymore since this filter doesn't update offsets. You should modify your analysis chain to use NGramTokenizer, and potentially override NGramTokenizer.isTokenChar(int) to perform pre-tokenization.
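To make the pre-tokenization idea concrete, here is a minimal plain-Java sketch of what a tokenizer-side approach achieves: the text is split into runs of token characters, n-grams never cross a non-token character, and each n-gram carries its own start and end offsets in the original text, which is what partial highlighting needs. This is not Lucene code. `PreTokenizedNGramSketch`, `Gram`, and the `IntPredicate` parameter are hypothetical stand-ins for the role played by overriding `isTokenChar(int)`, and the emission order shown is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical helper for illustration; not part of Lucene.
final class PreTokenizedNGramSketch {

    /** An n-gram together with its [start, end) char offsets in the original text. */
    static final class Gram {
        final String text;
        final int start;
        final int end;

        Gram(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return text + "[" + start + "," + end + ")";
        }
    }

    /** Emits n-grams for each maximal run of characters accepted by isTokenChar. */
    static List<Gram> ngrams(String text, int minGram, int maxGram, IntPredicate isTokenChar) {
        List<Gram> out = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            if (!isTokenChar.test(cp)) {
                i += Character.charCount(cp); // skip separator characters
                continue;
            }
            int runEnd = i;
            while (runEnd < text.length() && isTokenChar.test(text.codePointAt(runEnd))) {
                runEnd += Character.charCount(text.codePointAt(runEnd));
            }
            emitRun(text, i, runEnd, minGram, maxGram, out);
            i = runEnd;
        }
        return out;
    }

    // N-grams within one run, by start position, then by increasing length.
    private static void emitRun(String text, int runStart, int runEnd,
                                int minGram, int maxGram, List<Gram> out) {
        for (int start = runStart; start < runEnd; start = text.offsetByCodePoints(start, 1)) {
            int end = start;
            for (int len = 1; len <= maxGram && end < runEnd; len++) {
                end = text.offsetByCodePoints(end, 1);
                if (len >= minGram) {
                    out.add(new Gram(text.substring(start, end), start, end));
                }
            }
        }
    }

    public static void main(String[] args) {
        // Treat letters and digits as token characters; whitespace separates runs.
        System.out.println(ngrams("ab cd", 2, 2, Character::isLetterOrDigit));
    }
}
```

Because every n-gram here records where it sits in the source text, a highlighter could mark just the matching fragment rather than the whole original token.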
Nested classes/interfaces inherited from class AttributeSource: AttributeSource.AttributeFactory, AttributeSource.State
Modifier and Type | Field and Description |
---|---|
static int | DEFAULT_MAX_NGRAM_SIZE |
static int | DEFAULT_MIN_NGRAM_SIZE |
Fields inherited from class TokenFilter: input
Constructor and Description |
---|
NGramTokenFilter(Version version, TokenStream input) Creates NGramTokenFilter with default min and max n-grams. |
NGramTokenFilter(Version version, TokenStream input, int minGram, int maxGram) Creates NGramTokenFilter with given min and max n-grams. |
Modifier and Type | Method and Description |
---|---|
boolean | incrementToken() Advances the stream to the next token; returns false at end of stream. |
void | reset() |
Methods inherited from class TokenFilter: close, end
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
public static final int DEFAULT_MIN_NGRAM_SIZE
public static final int DEFAULT_MAX_NGRAM_SIZE
public NGramTokenFilter(Version version, TokenStream input, int minGram, int maxGram)
Parameters:
version - Lucene version to enable correct position increments. See above for details.
input - TokenStream holding the input to be tokenized
minGram - the smallest n-gram to generate
maxGram - the largest n-gram to generate
public NGramTokenFilter(Version version, TokenStream input)
Parameters:
version - Lucene version to enable correct position increments. See above for details.
input - TokenStream holding the input to be tokenized
public final boolean incrementToken() throws IOException
Specified by: incrementToken in class TokenStream
Throws: IOException
public void reset() throws IOException
Overrides: reset in class TokenFilter
Throws: IOException
Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.