Package org.apache.lucene.analysis.ngram
Class EdgeNGramTokenFilter
- java.lang.Object
-
- org.apache.lucene.util.AttributeSource
-
- org.apache.lucene.analysis.TokenStream
-
- org.apache.lucene.analysis.TokenFilter
-
- org.apache.lucene.analysis.ngram.EdgeNGramTokenFilter
-
- All Implemented Interfaces:
Closeable, AutoCloseable
public final class EdgeNGramTokenFilter extends TokenFilter
Tokenizes the given token into n-grams of the given size(s). This TokenFilter creates n-grams from the beginning edge of an input token. As of Lucene 4.4, this filter correctly handles supplementary characters.
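To illustrate what "edge" n-grams and correct handling of supplementary characters mean in practice, here is a minimal standalone sketch. It does not use Lucene and is not the filter's implementation; the class EdgePrefixSketch and the method edgePrefixes are hypothetical names introduced for this example. It assumes gram lengths are counted in Unicode code points, so a character stored as a surrogate pair is never split.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Lucene.
class EdgePrefixSketch {

    // Returns the prefixes of term with lengths 1..maxGram, where length is
    // measured in code points rather than in Java char units.
    static List<String> edgePrefixes(String term, int maxGram) {
        List<String> grams = new ArrayList<>();
        int length = term.codePointCount(0, term.length());
        for (int n = 1; n <= Math.min(maxGram, length); n++) {
            // offsetByCodePoints skips over whole surrogate pairs.
            int end = term.offsetByCodePoints(0, n);
            grams.add(term.substring(0, end));
        }
        return grams;
    }

    public static void main(String[] args) {
        // U+1D54F is one code point stored as two chars, followed by "yz".
        String term = "\uD835\uDD4Fyz";
        for (String gram : edgePrefixes(term, 2)) {
            System.out.println(gram + " (" + gram.length() + " chars)");
        }
    }
}
```

A prefix computed with a plain char index of 1 would cut the surrogate pair in half and produce an invalid string; counting in code points avoids that.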
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
AttributeSource.State
-
-
Field Summary
Fields
Modifier and Type    Field
static boolean       DEFAULT_PRESERVE_ORIGINAL
-
Fields inherited from class org.apache.lucene.analysis.TokenFilter
input
-
Fields inherited from class org.apache.lucene.analysis.TokenStream
DEFAULT_TOKEN_ATTRIBUTE_FACTORY
-
-
Constructor Summary
Constructors
EdgeNGramTokenFilter(TokenStream input, int gramSize)
Creates an EdgeNGramTokenFilter that produces edge n-grams of the given size.
EdgeNGramTokenFilter(TokenStream input, int minGram, int maxGram, boolean preserveOriginal)
Creates an EdgeNGramTokenFilter that, for a given input term, produces all edge n-grams with lengths >= minGram and <= maxGram.
-
Method Summary
Methods
Modifier and Type    Method
void                 end()
boolean              incrementToken()
void                 reset()
-
Methods inherited from class org.apache.lucene.analysis.TokenFilter
close
-
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
-
-
-
-
Field Detail
-
DEFAULT_PRESERVE_ORIGINAL
public static final boolean DEFAULT_PRESERVE_ORIGINAL
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
EdgeNGramTokenFilter
public EdgeNGramTokenFilter(TokenStream input, int minGram, int maxGram, boolean preserveOriginal)
Creates an EdgeNGramTokenFilter that, for a given input term, produces all edge n-grams with lengths >= minGram and <= maxGram. Optionally preserves the original term when its length is outside the defined range.
- Parameters:
input - TokenStream holding the input to be tokenized
minGram - the minimum length of the generated n-grams
maxGram - the maximum length of the generated n-grams
preserveOriginal - whether to keep the original term when its length is outside the min/max size range
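The interaction of minGram, maxGram, and preserveOriginal can be sketched without Lucene as follows. This is an illustration of the documented behavior under stated assumptions, not the filter's actual code: the class EdgeNGramSketch and the method edgeNGrams are hypothetical, the argument check and the choice to place the preserved original after the generated grams belong to this sketch, and token attributes such as offsets and position increments are not modeled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Lucene.
class EdgeNGramSketch {

    // Produces every prefix of term whose code point length n satisfies
    // minGram <= n <= maxGram. If preserveOriginal is true and the term's
    // length lies outside [minGram, maxGram], the whole term is appended.
    static List<String> edgeNGrams(String term, int minGram, int maxGram,
                                   boolean preserveOriginal) {
        if (minGram < 1 || minGram > maxGram) {
            // Sketch-level validation; an assumption of this example.
            throw new IllegalArgumentException("require 1 <= minGram <= maxGram");
        }
        List<String> out = new ArrayList<>();
        int length = term.codePointCount(0, term.length());
        for (int n = minGram; n <= Math.min(maxGram, length); n++) {
            out.add(term.substring(0, term.offsetByCodePoints(0, n)));
        }
        if (preserveOriginal && (length < minGram || length > maxGram)) {
            out.add(term);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(edgeNGrams("lucene", 2, 4, false));
        System.out.println(edgeNGrams("lucene", 2, 4, true));
        System.out.println(edgeNGrams("a", 2, 4, true));
    }
}
```

In this sketch, a term shorter than minGram yields no grams at all, so preserveOriginal is the only way such a term survives; a term whose length already falls inside the range is emitted once as its longest gram and is not duplicated.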
-
EdgeNGramTokenFilter
public EdgeNGramTokenFilter(TokenStream input, int gramSize)
Creates an EdgeNGramTokenFilter that produces edge n-grams of the given size.
- Parameters:
input - TokenStream holding the input to be tokenized
gramSize - the n-gram size to generate
-
-
Method Detail
-
incrementToken
public final boolean incrementToken() throws IOException
- Specified by:
incrementToken in class TokenStream
- Throws:
IOException
-
reset
public void reset() throws IOException
- Overrides:
reset in class TokenFilter
- Throws:
IOException
-
end
public void end() throws IOException
- Overrides:
end in class TokenFilter
- Throws:
IOException
-
-