org.apache.lucene.analysis.hunspell
Class HunspellStemFilter
java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.analysis.hunspell.HunspellStemFilter
- All Implemented Interfaces:
- Closeable
public final class HunspellStemFilter
extends TokenFilter
TokenFilter that uses hunspell affix rules and words to stem tokens. Since hunspell supports a word having multiple
stems, this filter can emit multiple tokens for each consumed token.
Note: This filter is aware of the KeywordAttribute. To prevent certain terms from being passed to the stemmer, KeywordAttribute.isKeyword() should be set to true in a previous TokenStream.
Note: For including the original term as well as the stemmed version, see KeywordRepeatFilterFactory.
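The behaviour described above (several stems per input token, and keyword-marked tokens bypassing the stemmer) can be pictured with a small stand-alone model. This is only a sketch in plain Java, not Lucene's implementation: the class MultiStemSketch, the STEMS table, and the filter method are hypothetical, the stem lists are invented for illustration, and passing unknown words through unchanged is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-alone model; it does not use or reproduce Lucene classes.
final class MultiStemSketch {
    // Invented stem table standing in for a hunspell dictionary lookup.
    static final Map<String, List<String>> STEMS = new HashMap<>();
    static {
        STEMS.put("leaves", Arrays.asList("leaf", "leave"));
        STEMS.put("running", Arrays.asList("run"));
    }

    // Returns the tokens the model emits for the given input tokens.
    // Tokens listed in keywords play the role of tokens whose
    // KeywordAttribute.isKeyword() is true: they skip stemming.
    static List<String> filter(List<String> tokens, Set<String> keywords) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            List<String> stems = STEMS.get(token);
            if (keywords.contains(token) || stems == null) {
                out.add(token);      // keyword, or unknown word (assumption): unchanged
            } else {
                out.addAll(stems);   // one input token may yield several output tokens
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("leaves", "running");
        System.out.println(filter(input, new HashSet<String>()));
        System.out.println(filter(input, new HashSet<>(Arrays.asList("running"))));
    }
}
```

With the invented table, the first call prints [leaf, leave, run] and the second prints [leaf, leave, running], since "running" is treated as a keyword and left untouched.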
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
HunspellStemFilter
public HunspellStemFilter(TokenStream input,
HunspellDictionary dictionary)
- Creates a HunspellStemFilter that deduplicates stems and has a maximum recursion level of 2.
- See Also:
HunspellStemFilter(TokenStream, HunspellDictionary, int)
HunspellStemFilter
public HunspellStemFilter(TokenStream input,
HunspellDictionary dictionary,
int recursionCap)
- Creates a new HunspellStemFilter that will stem tokens from the given TokenStream using affix rules in the provided HunspellDictionary.
- Parameters:
input - TokenStream whose tokens will be stemmed
dictionary - HunspellDictionary containing the affix rules and words that will be used to stem the tokens
recursionCap - maximum level of recursion the stemmer can go into (constructors without this parameter use 2)
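To make the idea of a recursion cap concrete, the following toy stemmer strips one suffix per recursion level and stops once the cap is reached. It is a sketch under stated assumptions, not the algorithm used by this filter: RecursionCapSketch, its SUFFIXES and WORDS tables, and the way depth is counted are all hypothetical, and the real stemmer's depth semantics may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy stemmer; it only illustrates what a recursion cap limits.
final class RecursionCapSketch {
    // Invented suffix rules and word list standing in for a hunspell dictionary.
    static final List<String> SUFFIXES = Arrays.asList("s", "ness", "ful");
    static final Set<String> WORDS = new HashSet<>(Arrays.asList("hope"));

    static List<String> stem(String word, int recursionCap) {
        return stem(word, 0, recursionCap);
    }

    // depth counts how many suffixes have been stripped so far (assumption of this sketch).
    private static List<String> stem(String word, int depth, int recursionCap) {
        List<String> stems = new ArrayList<>();
        if (WORDS.contains(word)) {
            stems.add(word);
        }
        if (depth >= recursionCap) {
            return stems;            // cap reached: do not strip further
        }
        for (String suffix : SUFFIXES) {
            if (word.length() > suffix.length() && word.endsWith(suffix)) {
                String stripped = word.substring(0, word.length() - suffix.length());
                stems.addAll(stem(stripped, depth + 1, recursionCap));
            }
        }
        return stems;
    }

    public static void main(String[] args) {
        System.out.println(stem("hopefulness", 1));
        System.out.println(stem("hopefulness", 2));
    }
}
```

In this model "hopefulness" needs two strips ("ness", then "ful") to reach "hope", so a cap of 1 prints [] and a cap of 2 prints [hope]. A higher cap finds more deeply affixed forms at the cost of more candidate lookups.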
HunspellStemFilter
public HunspellStemFilter(TokenStream input,
HunspellDictionary dictionary,
boolean dedup)
- Creates a HunspellStemFilter that has a maximum recursion level of 2.
- See Also:
HunspellStemFilter(TokenStream, HunspellDictionary, boolean, int)
HunspellStemFilter
public HunspellStemFilter(TokenStream input,
HunspellDictionary dictionary,
boolean dedup,
int recursionCap)
- Creates a new HunspellStemFilter that will stem tokens from the given TokenStream using affix rules in the provided HunspellDictionary.
- Parameters:
input - TokenStream whose tokens will be stemmed
dictionary - HunspellDictionary containing the affix rules and words that will be used to stem the tokens
dedup - true if only unique terms should be output
recursionCap - maximum level of recursion the stemmer can go into (constructors without this parameter use 2)
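The effect of the dedup flag can be illustrated with a few lines of plain Java. This is a minimal sketch, assuming that different affix rules may produce the same stem more than once; DedupSketch and its emit method are hypothetical names, and keeping first-occurrence order is a choice of the sketch rather than a documented guarantee of the filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical model of the dedup flag; it does not use Lucene classes.
final class DedupSketch {
    // candidateStems stands for all stems found for one input token.
    static List<String> emit(List<String> candidateStems, boolean dedup) {
        if (!dedup) {
            return new ArrayList<>(candidateStems);   // repeated stems are kept
        }
        // LinkedHashSet drops repeats while preserving first-occurrence order.
        return new ArrayList<>(new LinkedHashSet<>(candidateStems));
    }

    public static void main(String[] args) {
        List<String> candidates = Arrays.asList("run", "run", "runn");
        System.out.println(emit(candidates, false));
        System.out.println(emit(candidates, true));
    }
}
```

For the invented candidate list, the first call prints [run, run, runn] and the second prints [run, runn].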
incrementToken
public boolean incrementToken()
throws IOException
- Specified by:
incrementToken in class TokenStream
- Throws:
IOException
reset
public void reset()
throws IOException
- Overrides:
reset in class TokenFilter
- Throws:
IOException
Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.