Package org.apache.lucene.analysis
Class StopFilter
- java.lang.Object
  - org.apache.lucene.util.AttributeSource
    - org.apache.lucene.analysis.TokenStream
      - org.apache.lucene.analysis.TokenFilter
        - org.apache.lucene.analysis.FilteringTokenFilter
          - org.apache.lucene.analysis.StopFilter
All Implemented Interfaces:
Closeable, AutoCloseable, Unwrappable<TokenStream>
public class StopFilter extends FilteringTokenFilter
Removes stop words from a token stream.
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
AttributeSource.State
Field Summary
Fields inherited from class org.apache.lucene.analysis.TokenFilter
input
Fields inherited from class org.apache.lucene.analysis.TokenStream
DEFAULT_TOKEN_ATTRIBUTE_FACTORY
Constructor Summary
StopFilter(TokenStream in, CharArraySet stopWords)
Constructs a filter that removes from the input TokenStream all words contained in the given CharArraySet.
Method Summary
Modifier and Type, Method, Description
protected boolean
accept()
Returns true if the current token's term is not a stop word, so that the token is kept.
static CharArraySet
makeStopSet(String... stopWords)
Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor.
static CharArraySet
makeStopSet(String[] stopWords, boolean ignoreCase)
Creates a stopword set from the given stopword array.
static CharArraySet
makeStopSet(List<?> stopWords)
Builds a Set from a list of stop words, appropriate for passing into the StopFilter constructor.
static CharArraySet
makeStopSet(List<?> stopWords, boolean ignoreCase)
Creates a stopword set from the given stopword list.
Methods inherited from class org.apache.lucene.analysis.FilteringTokenFilter
end, incrementToken, reset
Methods inherited from class org.apache.lucene.analysis.TokenFilter
close, unwrap
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
Constructor Detail
StopFilter
public StopFilter(TokenStream in, CharArraySet stopWords)
Constructs a filter that removes from the input TokenStream all words contained in the given CharArraySet.
Parameters:
in - Input stream
stopWords - A CharArraySet representing the stopwords.
See Also:
makeStopSet(java.lang.String...)
Method Detail
makeStopSet
public static CharArraySet makeStopSet(String... stopWords)
Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor. This allows the stop-word set to be built once and cached when an Analyzer is constructed.
Parameters:
stopWords - An array of stopwords
See Also:
makeStopSet with an ignoreCase argument; this overload is equivalent to passing false to ignoreCase
makeStopSet
public static CharArraySet makeStopSet(List<?> stopWords)
Builds a Set from a list of stop words, appropriate for passing into the StopFilter constructor. This allows the stop-word set to be built once and cached when an Analyzer is constructed.
Parameters:
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
Returns:
A Set (CharArraySet) containing the words
See Also:
makeStopSet with an ignoreCase argument; this overload is equivalent to passing false to ignoreCase
makeStopSet
public static CharArraySet makeStopSet(String[] stopWords, boolean ignoreCase)
Creates a stopword set from the given stopword array.
Parameters:
stopWords - An array of stopwords
ignoreCase - If true, all words are lower cased first.
Returns:
A Set (CharArraySet) containing the words
makeStopSet
public static CharArraySet makeStopSet(List<?> stopWords, boolean ignoreCase)
Creates a stopword set from the given stopword list.
Parameters:
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
ignoreCase - If true, all words are lower cased first.
Returns:
A Set (CharArraySet) containing the words
accept
protected boolean accept()
Returns true if the current token's term is not a stop word, so that the token is kept; tokens whose term is in the stop set are dropped.
Specified by:
accept in class FilteringTokenFilter
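The interaction between the stop set, the ignoreCase flag, and the accept() decision can be illustrated with a small plain-Java model. This is only a sketch under stated assumptions, not Lucene code: it uses java.util collections instead of CharArraySet and TokenStream, and the names StopWordSketch, buildStopSet, accept, and removeStopWords are hypothetical. It assumes that with ignoreCase set, both the stored words and the looked-up terms are compared in lower case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java model of stop-word removal. NOT Lucene code; all names are hypothetical.
final class StopWordSketch {

    // Models makeStopSet(List<?>, boolean): char[] entries are read as text,
    // other entries via toString(); words are lower-cased when ignoreCase is true.
    static Set<String> buildStopSet(List<?> stopWords, boolean ignoreCase) {
        Set<String> set = new HashSet<>();
        for (Object word : stopWords) {
            String s = (word instanceof char[]) ? new String((char[]) word) : word.toString();
            set.add(ignoreCase ? s.toLowerCase(Locale.ROOT) : s);
        }
        return set;
    }

    // Models the accept() decision: true means the term is kept.
    // Assumption: a case-insensitive set also lower-cases the term on lookup.
    static boolean accept(String term, Set<String> stopSet, boolean ignoreCase) {
        String key = ignoreCase ? term.toLowerCase(Locale.ROOT) : term;
        return !stopSet.contains(key);
    }

    // Models the filter as a whole: keeps only the accepted terms, in order.
    static List<String> removeStopWords(List<String> tokens, Set<String> stopSet, boolean ignoreCase) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (accept(token, stopSet, ignoreCase)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("The", "quick", "fox", "and", "the", "dog");
        Set<String> caseSensitive = buildStopSet(Arrays.asList("the", "and"), false);
        Set<String> caseInsensitive = buildStopSet(Arrays.asList("The", "AND"), true);
        System.out.println(removeStopWords(tokens, caseSensitive, false));
        System.out.println(removeStopWords(tokens, caseInsensitive, true));
    }
}
```

In this model the case-sensitive set lets the capitalized "The" through, while the case-insensitive set removes it. With the real class, the equivalent step would be to build the set once with makeStopSet and pass it to the StopFilter constructor.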