public final class StopFilter extends FilteringTokenFilter
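Nested classes inherited from class AttributeSource: AttributeSource.State
Fields inherited from class TokenFilter: input
Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY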
Constructor and Description |
---|
StopFilter(TokenStream in, CharArraySet stopWords): Constructs a filter which removes words from the input TokenStream that are named in the Set. |
Modifier and Type | Method and Description |
---|---|
protected boolean | accept(): Returns true if the term of the current input token is not a stop word, so that the token is kept. |
static CharArraySet | makeStopSet(List<?> stopWords): Builds a Set from a list of stop words, appropriate for passing into the StopFilter constructor. |
static CharArraySet | makeStopSet(List<?> stopWords, boolean ignoreCase): Creates a stopword set from the given stopword list. |
static CharArraySet | makeStopSet(String... stopWords): Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor. |
static CharArraySet | makeStopSet(String[] stopWords, boolean ignoreCase): Creates a stopword set from the given stopword array. |
Methods inherited from class FilteringTokenFilter: end, incrementToken, reset
Methods inherited from class TokenFilter: close
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
public StopFilter(TokenStream in, CharArraySet stopWords)
Parameters:
in - Input stream
stopWords - A CharArraySet representing the stopwords.
See Also:
makeStopSet(java.lang.String...)
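
The effect the constructor describes (tokens named in the set are dropped, everything else passes through in order) can be sketched in plain Java. This is only a model of the behaviour: `StopWordRemovalSketch` and `removeStopWords` are hypothetical names, and a `List<String>` and `Set<String>` stand in for the TokenStream and the CharArraySet, so no Lucene classes are involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model of stop word removal; it does not use Lucene classes.
final class StopWordRemovalSketch {

    // Keeps every token that is not in the stop set, preserving input order.
    static List<String> removeStopWords(List<String> tokens, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (!stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "a", "of"));
        List<String> tokens = Arrays.asList("the", "history", "of", "a", "library");
        System.out.println(removeStopWords(tokens, stopWords)); // prints [history, library]
    }
}
```

The model matches exact strings only, which corresponds to a stop set built without case folding.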
public static CharArraySet makeStopSet(String... stopWords)
Parameters:
stopWords - An array of stopwords
See Also:
makeStopSet(java.lang.String[], boolean) passing false to ignoreCase
public static CharArraySet makeStopSet(List<?> stopWords)
Parameters:
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
Returns:
A Set (CharArraySet) containing the words
See Also:
makeStopSet(java.util.List, boolean) passing false to ignoreCase
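
The one-argument overloads behave like the two-argument ones with ignoreCase set to false. A rough plain-Java model of what the flag changes is sketched below. `StopSetSketch` is a hypothetical stand-in for CharArraySet, not the Lucene class, and it assumes that lookups are folded the same way as the stored words when ignoreCase is true.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java model of a stop set with an ignoreCase flag; not Lucene's CharArraySet.
final class StopSetSketch {
    private final Set<String> words = new HashSet<>();
    private final boolean ignoreCase;

    StopSetSketch(List<?> stopWords, boolean ignoreCase) {
        this.ignoreCase = ignoreCase;
        for (Object word : stopWords) {
            // char[] entries are read as text; anything else goes through toString().
            String text = (word instanceof char[]) ? new String((char[]) word) : word.toString();
            words.add(normalize(text));
        }
    }

    // Lower-cases only when ignoreCase is set, so stored words and lookups agree.
    private String normalize(String text) {
        return ignoreCase ? text.toLowerCase(Locale.ROOT) : text;
    }

    boolean contains(String term) {
        return words.contains(normalize(term));
    }

    public static void main(String[] args) {
        List<Object> raw = Arrays.asList("The", "and".toCharArray());
        StopSetSketch exact = new StopSetSketch(raw, false);
        StopSetSketch folded = new StopSetSketch(raw, true);
        System.out.println(exact.contains("the"));  // prints false
        System.out.println(folded.contains("THE")); // prints true
        System.out.println(exact.contains("and"));  // prints true
    }
}
```

With ignoreCase set to false the set only matches terms exactly as they were supplied, so a case-sensitive stop set is usually paired with input that has already been lower-cased.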
public static CharArraySet makeStopSet(String[] stopWords, boolean ignoreCase)
Parameters:
stopWords - An array of stopwords
ignoreCase - If true, all words are lower cased first.
public static CharArraySet makeStopSet(List<?> stopWords, boolean ignoreCase)
Parameters:
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
ignoreCase - If true, all words are lower cased first.
Returns:
A Set (CharArraySet) containing the words
protected boolean accept()
Specified by:
accept in class FilteringTokenFilter
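
Because accept() is specified by FilteringTokenFilter, the stop filter only supplies the keep-or-drop decision while the base class drives the iteration. A minimal plain-Java sketch of that division of labour follows. All class names here are hypothetical, an `Iterator<String>` stands in for the input TokenStream, and the loop is a simplification rather than Lucene's actual incrementToken implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Plain-Java model of the accept() hook; these are not Lucene classes.
final class AcceptHookSketch {

    abstract static class FilteringSketch {
        private final Iterator<String> input;
        protected String current; // the token currently under consideration

        FilteringSketch(Iterator<String> input) {
            this.input = input;
        }

        // Subclasses decide whether the current token is kept.
        protected abstract boolean accept();

        // Advances to the next accepted token; returns false once input is exhausted.
        final boolean incrementToken() {
            while (input.hasNext()) {
                current = input.next();
                if (accept()) {
                    return true;
                }
            }
            return false;
        }
    }

    static final class StopSketch extends FilteringSketch {
        private final Set<String> stopWords;

        StopSketch(Iterator<String> input, Set<String> stopWords) {
            super(input);
            this.stopWords = stopWords;
        }

        @Override
        protected boolean accept() {
            return !stopWords.contains(current);
        }
    }

    // Collects every token the filter lets through.
    static List<String> drain(FilteringSketch filter) {
        List<String> out = new ArrayList<>();
        while (filter.incrementToken()) {
            out.add(filter.current);
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("to", "or"));
        List<String> tokens = Arrays.asList("to", "be", "or", "not", "to", "be");
        System.out.println(drain(new StopSketch(tokens.iterator(), stopWords))); // prints [be, not, be]
    }
}
```

Keeping the decision in a single boolean hook means other filters can reuse the same loop and only replace the predicate.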
Copyright © 2000-2015 Apache Software Foundation. All Rights Reserved.