java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.analysis.util.FilteringTokenFilter
org.apache.lucene.analysis.core.StopFilter
public final class StopFilter
Removes stop words from a token stream.
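You must specify the required Version compatibility when creating a StopFilter. The matchVersion argument enables correct Unicode 4.0 behavior in the stop set for versions later than 3.0.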
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource |
---|
AttributeSource.AttributeFactory, AttributeSource.State |
Field Summary |
---|
Fields inherited from class org.apache.lucene.analysis.TokenFilter |
---|
input |
Constructor Summary | |
---|---|
StopFilter(Version matchVersion, TokenStream in, CharArraySet stopWords) | Constructs a filter which removes words from the input TokenStream that are named in the Set. |
Method Summary | |
---|---|
protected boolean | accept(): Returns true if the current token's term is not a stop word, so that the token is kept. |
static CharArraySet | makeStopSet(Version matchVersion, List<?> stopWords): Builds a Set from a list of stop words, appropriate for passing into the StopFilter constructor. |
static CharArraySet | makeStopSet(Version matchVersion, List<?> stopWords, boolean ignoreCase): Creates a stopword set from the given stopword list. |
static CharArraySet | makeStopSet(Version matchVersion, String... stopWords): Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor. |
static CharArraySet | makeStopSet(Version matchVersion, String[] stopWords, boolean ignoreCase): Creates a stopword set from the given stopword array. |
Methods inherited from class org.apache.lucene.analysis.util.FilteringTokenFilter |
---|
getEnablePositionIncrements, incrementToken, reset, setEnablePositionIncrements |
Methods inherited from class org.apache.lucene.analysis.TokenFilter |
---|
close, end |
Methods inherited from class org.apache.lucene.util.AttributeSource |
---|
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public StopFilter(Version matchVersion, TokenStream in, CharArraySet stopWords)
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the stop set if Version > 3.0. See above for details.
in - Input stream
stopWords - A CharArraySet representing the stopwords.
See Also:
makeStopSet(Version, java.lang.String...)
Method Detail |
---|
public static CharArraySet makeStopSet(Version matchVersion, String... stopWords)
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
stopWords - An array of stopwords
See Also:
makeStopSet(Version, java.lang.String[], boolean), passing false to ignoreCase
public static CharArraySet makeStopSet(Version matchVersion, List<?> stopWords)
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
Returns:
A Set (CharArraySet) containing the words
See Also:
makeStopSet(Version, java.util.List, boolean), passing false to ignoreCase
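The statement that the list may hold Strings, char[] values, or any other toString()-able objects can be illustrated with a small stand-alone model. The sketch below is not Lucene code: `ListStopSetSketch` and `fromList` are hypothetical names, a plain `HashSet<String>` stands in for `CharArraySet`, and the separate handling of `char[]` (whose own `toString()` does not return its characters) is an assumption about how such elements need to be treated.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for illustration only; not part of Lucene.
class ListStopSetSketch {

    // Converts each element to a String and collects the results.
    // char[] elements are copied into a String; everything else uses toString().
    static Set<String> fromList(List<?> stopWords) {
        Set<String> set = new HashSet<>();
        for (Object word : stopWords) {
            if (word instanceof char[]) {
                set.add(new String((char[]) word));
            } else {
                set.add(String.valueOf(word));
            }
        }
        return set;
    }

    public static void main(String[] args) {
        List<Object> mixed =
            Arrays.<Object>asList("the", new char[] {'o', 'f'}, new StringBuilder("an"));
        Set<String> stopSet = fromList(mixed);
        System.out.println(stopSet.contains("of")); // prints "true"
    }
}
```

In this model all three element kinds end up as ordinary strings, so a later lookup does not need to know how a stop word was originally supplied.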
public static CharArraySet makeStopSet(Version matchVersion, String[] stopWords, boolean ignoreCase)
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
stopWords - An array of stopwords
ignoreCase - If true, all words are lower cased first.
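The effect of ignoreCase can be pictured with a stand-alone model. This is a sketch under stated assumptions, not the library's implementation: `IgnoreCaseStopSetSketch` is a hypothetical class, a `HashSet<String>` stands in for `CharArraySet`, and it assumes that lookups apply the same lower-casing as insertion when ignoreCase is true.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for illustration only; not part of Lucene.
class IgnoreCaseStopSetSketch {
    private final Set<String> words = new HashSet<>();
    private final boolean ignoreCase;

    IgnoreCaseStopSetSketch(String[] stopWords, boolean ignoreCase) {
        this.ignoreCase = ignoreCase;
        for (String word : stopWords) {
            words.add(normalize(word));
        }
    }

    // Lower-cases only when ignoreCase is set; otherwise leaves the word untouched.
    private String normalize(String word) {
        return ignoreCase ? word.toLowerCase(Locale.ROOT) : word;
    }

    // Assumption: lookups use the same normalization as insertion.
    boolean contains(String term) {
        return words.contains(normalize(term));
    }

    public static void main(String[] args) {
        String[] stopWords = {"The", "Of"};
        // Case-insensitive set: "THE" matches the stored "the".
        System.out.println(new IgnoreCaseStopSetSketch(stopWords, true).contains("THE"));  // prints "true"
        // Case-sensitive set: only the exact spelling "The" would match.
        System.out.println(new IgnoreCaseStopSetSketch(stopWords, false).contains("the")); // prints "false"
    }
}
```

With ignoreCase set to false, the stop list must already match the case of the incoming terms, for example because an earlier stage lower-cases them.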
public static CharArraySet makeStopSet(Version matchVersion, List<?> stopWords, boolean ignoreCase)
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
ignoreCase - If true, all words are lower cased first
Returns:
A Set (CharArraySet) containing the words
protected boolean accept()
Specified by:
accept in class FilteringTokenFilter
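Because accept() is declared in FilteringTokenFilter and returns a boolean, it appears to act as a per-token keep-or-drop hook that the parent class calls while advancing the stream. The following stand-alone sketch models that relationship with plain strings. All names here (`AcceptSketch`, `Filtering`, `Stop`, `filter`) are hypothetical, and the loop is a simplification that ignores attributes and position increments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins for illustration only; not Lucene classes.
class AcceptSketch {

    // Plays the role of the filtering base class: it owns the loop.
    abstract static class Filtering {
        // Subclasses decide whether a single term is kept.
        protected abstract boolean accept(String term);

        // Walks the input and keeps only the accepted terms.
        List<String> filter(List<String> terms) {
            List<String> kept = new ArrayList<>();
            for (String term : terms) {
                if (accept(term)) {
                    kept.add(term);
                }
            }
            return kept;
        }
    }

    // Plays the role of the stop filter: it only supplies the decision.
    static class Stop extends Filtering {
        private final Set<String> stopWords;

        Stop(Set<String> stopWords) {
            this.stopWords = stopWords;
        }

        @Override
        protected boolean accept(String term) {
            return !stopWords.contains(term); // keep everything that is not a stop word
        }
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "of"));
        List<String> terms = Arrays.asList("the", "art", "of", "search");
        System.out.println(new Stop(stopWords).filter(terms)); // prints "[art, search]"
    }
}
```

Keeping the loop in the base class means a subclass only has to answer one question per token, which is why the real class exposes so few methods of its own.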