Class HunspellStemFilter
java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.analysis.hunspell.HunspellStemFilter
- All Implemented Interfaces:
Closeable, AutoCloseable, Unwrappable<TokenStream>
TokenFilter that uses hunspell affix rules and words to stem tokens. Since hunspell supports a
word having multiple stems, this filter can emit multiple tokens for each consumed token.
Note: This filter is aware of the KeywordAttribute. To prevent certain terms from being passed
to the stemmer, KeywordAttribute.isKeyword() should be set to true in a previous TokenStream.
Note: To include the original term as well as the stemmed version, see KeywordRepeatFilterFactory.
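The emission behaviour described above can be illustrated with a small standalone model. This is a sketch only and does not call Lucene: the class StemEmissionSketch, its emit method, the stems map (a stand-in for a Hunspell dictionary lookup) and the keywords set (a stand-in for tokens whose KeywordAttribute.isKeyword() is true) are all hypothetical names introduced for illustration. It also assumes that a token with no known stem is passed through unchanged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Standalone model of the emission contract; it does not use the Lucene API. */
final class StemEmissionSketch {

  /**
   * Returns the terms emitted for the given input tokens.
   *
   * @param tokens   consumed tokens, in order
   * @param stems    hypothetical stand-in for a Hunspell dictionary lookup
   * @param keywords tokens treated as if KeywordAttribute.isKeyword() were true
   */
  static List<String> emit(
      List<String> tokens, Map<String, List<String>> stems, Set<String> keywords) {
    List<String> out = new ArrayList<>();
    for (String token : tokens) {
      if (keywords.contains(token)) {
        out.add(token); // keyword: never handed to the stemmer
        continue;
      }
      List<String> found = stems.getOrDefault(token, List.of());
      if (found.isEmpty()) {
        out.add(token); // assumption: no known stem, token passes through unchanged
      } else {
        out.addAll(found); // one consumed token may yield several emitted tokens
      }
    }
    return out;
  }

  public static void main(String[] args) {
    Map<String, List<String>> stems =
        Map.of("drinks", List.of("drink"), "leaves", List.of("leaf", "leave"));
    List<String> tokens = List.of("drinks", "leaves");

    // No keywords: every token is stemmed, and "leaves" produces two terms.
    System.out.println(emit(tokens, stems, Set.of()));
    // "drinks" marked as a keyword: it is emitted as-is.
    System.out.println(emit(tokens, stems, Set.of("drinks")));
  }
}
```

In a real analysis chain the keyword flag is set per token by an earlier filter rather than looked up in a set; the set is used here only to keep the model short.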
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
AttributeSource.State
-
Field Summary
Fields inherited from class org.apache.lucene.analysis.TokenFilter
input
Fields inherited from class org.apache.lucene.analysis.TokenStream
DEFAULT_TOKEN_ATTRIBUTE_FACTORY
-
Constructor Summary
Constructor - Description
HunspellStemFilter(TokenStream input, Dictionary dictionary) - Create a HunspellStemFilter outputting all possible stems.
HunspellStemFilter(TokenStream input, Dictionary dictionary, boolean dedup) - Create a HunspellStemFilter outputting all possible stems.
HunspellStemFilter(TokenStream input, Dictionary dictionary, boolean dedup, boolean longestOnly) - Creates a new HunspellStemFilter that will stem tokens from the given TokenStream using affix rules in the provided Dictionary.
-
Method Summary
Methods inherited from class org.apache.lucene.analysis.TokenFilter
close, end, unwrap
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
-
Constructor Details
-
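HunspellStemFilter
public HunspellStemFilter(TokenStream input, Dictionary dictionary)
Create a HunspellStemFilter outputting all possible stems.
-
HunspellStemFilter
public HunspellStemFilter(TokenStream input, Dictionary dictionary, boolean dedup)
Create a HunspellStemFilter outputting all possible stems.
-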
HunspellStemFilter
public HunspellStemFilter(TokenStream input, Dictionary dictionary, boolean dedup, boolean longestOnly)
Creates a new HunspellStemFilter that will stem tokens from the given TokenStream using affix rules in the provided Dictionary.
- Parameters:
input - TokenStream whose tokens will be stemmed
dictionary - Hunspell Dictionary containing the affix rules and words that will be used to stem the tokens
longestOnly - true if only the longest term should be output.
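The effect of the two boolean flags on a list of candidate stems can be sketched as follows. This is a standalone model, not Lucene code: StemSelectionSketch and select are hypothetical names. The dedup flag is not described in the parameter list, so the sketch assumes it means dropping repeated stems; the tie-break between equally long stems is likewise an assumption made only to keep the result deterministic.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;

/** Standalone model of the dedup and longestOnly flags; it does not use the Lucene API. */
final class StemSelectionSketch {

  /** Longest stem first; equal lengths are ordered by reverse text order (assumption). */
  private static final Comparator<String> LONGEST_FIRST =
      (a, b) -> {
        int cmp = Integer.compare(b.length(), a.length());
        return cmp != 0 ? cmp : b.compareTo(a);
      };

  static List<String> select(List<String> candidates, boolean dedup, boolean longestOnly) {
    // Assumption: dedup removes repeated stems while keeping first-seen order.
    List<String> stems =
        dedup ? new ArrayList<>(new LinkedHashSet<>(candidates)) : new ArrayList<>(candidates);
    if (longestOnly && stems.size() > 1) {
      stems.sort(LONGEST_FIRST);
      return List.of(stems.get(0)); // keep only the longest term
    }
    return stems;
  }

  public static void main(String[] args) {
    List<String> candidates = List.of("leaf", "leave", "leaf");
    System.out.println(select(candidates, false, false)); // all candidates, repeats included
    System.out.println(select(candidates, true, false)); // repeats removed
    System.out.println(select(candidates, true, true)); // single longest stem
  }
}
```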
-
-
Method Details
-
incrementToken
- Specified by:
incrementToken in class TokenStream
- Throws:
IOException
-
reset
- Overrides:
reset in class TokenFilter
- Throws:
IOException
-