java.lang.Object
  org.apache.lucene.util.AttributeSource
    org.apache.lucene.analysis.TokenStream
      org.apache.lucene.analysis.TokenFilter
        org.apache.lucene.analysis.FilteringTokenFilter
          org.apache.lucene.analysis.StopFilter
public final class StopFilter
Removes stop words from a token stream.
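You must specify the required Version compatibility when creating a StopFilter. For versions later than 3.0, the stop set uses correct Unicode 4.0 behavior (see the matchVersion parameter under Constructor Detail).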
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource |
---|
AttributeSource.AttributeFactory, AttributeSource.State |
Field Summary |
---|
Fields inherited from class org.apache.lucene.analysis.TokenFilter |
---|
input |
Constructor Summary | |
---|---|
StopFilter(boolean enablePositionIncrements, TokenStream in, Set<?> stopWords) | Deprecated. Use StopFilter(Version, TokenStream, Set) instead. |
StopFilter(boolean enablePositionIncrements, TokenStream input, Set<?> stopWords, boolean ignoreCase) | Deprecated. Use StopFilter(Version, TokenStream, Set, boolean) instead. |
StopFilter(Version matchVersion, TokenStream in, Set<?> stopWords) | Constructs a filter which removes words from the input TokenStream that are named in the Set. |
StopFilter(Version matchVersion, TokenStream input, Set<?> stopWords, boolean ignoreCase) | Constructs a token stream filtering the given input. |
Method Summary | | |
---|---|---|
protected boolean | accept() | Returns the next input Token whose term() is not a stop word. |
static boolean | getEnablePositionIncrementsVersionDefault(Version matchVersion) | Deprecated. Use StopFilter(Version, TokenStream, Set) instead. |
static Set<Object> | makeStopSet(List<?> stopWords) | Deprecated. Use makeStopSet(Version, List) instead. |
static Set<Object> | makeStopSet(List<?> stopWords, boolean ignoreCase) | Deprecated. Use makeStopSet(Version, List, boolean) instead. |
static Set<Object> | makeStopSet(String... stopWords) | Deprecated. Use makeStopSet(Version, String...) instead. |
static Set<Object> | makeStopSet(String[] stopWords, boolean ignoreCase) | Deprecated. Use makeStopSet(Version, String[], boolean) instead. |
static Set<Object> | makeStopSet(Version matchVersion, List<?> stopWords) | Builds a Set from a list of stop words, appropriate for passing into the StopFilter constructor. |
static Set<Object> | makeStopSet(Version matchVersion, List<?> stopWords, boolean ignoreCase) | Creates a stopword set from the given stopword list. |
static Set<Object> | makeStopSet(Version matchVersion, String... stopWords) | Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor. |
static Set<Object> | makeStopSet(Version matchVersion, String[] stopWords, boolean ignoreCase) | Creates a stopword set from the given stopword array. |
Methods inherited from class org.apache.lucene.analysis.FilteringTokenFilter |
---|
getEnablePositionIncrements, incrementToken, setEnablePositionIncrements |
Methods inherited from class org.apache.lucene.analysis.TokenFilter |
---|
close, end, reset |
Methods inherited from class org.apache.lucene.util.AttributeSource |
---|
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
@Deprecated public StopFilter(boolean enablePositionIncrements, TokenStream input, Set<?> stopWords, boolean ignoreCase)
Deprecated. Use StopFilter(Version, TokenStream, Set, boolean) instead.
If stopWords is an instance of CharArraySet (true if makeStopSet() was used to construct the set), it will be used directly and ignoreCase will be ignored, since CharArraySet directly controls case sensitivity.
If stopWords is not an instance of CharArraySet, a new CharArraySet will be constructed and ignoreCase will be used to specify the case sensitivity of that set.
Parameters:
enablePositionIncrements - true if token positions should record the removed stop words
input - Input TokenStream
stopWords - A Set of Strings or char[] or any other toString()-able set representing the stopwords
ignoreCase - if true, all words are lower cased first

public StopFilter(Version matchVersion, TokenStream input, Set<?> stopWords, boolean ignoreCase)
Constructs a token stream filtering the given input. If stopWords is an instance of CharArraySet (true if makeStopSet() was used to construct the set), it will be used directly and ignoreCase will be ignored, since CharArraySet directly controls case sensitivity.
If stopWords is not an instance of CharArraySet, a new CharArraySet will be constructed and ignoreCase will be used to specify the case sensitivity of that set.
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the stop set if Version > 3.0. See above for details.
input - Input TokenStream
stopWords - A Set of Strings or char[] or any other toString()-able set representing the stopwords
ignoreCase - if true, all words are lower cased first

@Deprecated public StopFilter(boolean enablePositionIncrements, TokenStream in, Set<?> stopWords)
Deprecated. Use StopFilter(Version, TokenStream, Set) instead.
Parameters:
enablePositionIncrements - true if token positions should record the removed stop words
in - Input stream
stopWords - A Set of Strings or char[] or any other toString()-able set representing the stopwords
See Also:
makeStopSet(Version, java.lang.String[])

public StopFilter(Version matchVersion, TokenStream in, Set<?> stopWords)
Constructs a filter which removes words from the input TokenStream that are named in the Set.
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the stop set if Version > 3.0. See above for details.
in - Input stream
stopWords - A Set of Strings or char[] or any other toString()-able set representing the stopwords
See Also:
makeStopSet(Version, java.lang.String[])
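
The case-sensitivity rule shared by the two ignoreCase constructors can be modelled in plain Java. This is a minimal sketch and not Lucene code: CaseAwareSet is a hypothetical stand-in for CharArraySet, and resolve is a hypothetical helper that mirrors the documented rule (reuse the set if it already controls case sensitivity, otherwise build a new one from ignoreCase). Lowercasing with Locale.ROOT is an assumption of the sketch.

```java
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Locale;
import java.util.Set;

// Plain-Java model only; none of these types are Lucene classes.
final class StopSetResolution {

    // Hypothetical stand-in for CharArraySet: case sensitivity is fixed at construction.
    static final class CaseAwareSet extends AbstractSet<String> {
        private final Set<String> words = new HashSet<>();
        private final boolean ignoreCase;

        CaseAwareSet(Collection<?> source, boolean ignoreCase) {
            this.ignoreCase = ignoreCase;
            for (Object word : source) {
                words.add(normalize(word.toString()));
            }
        }

        private String normalize(String s) {
            return ignoreCase ? s.toLowerCase(Locale.ROOT) : s;
        }

        @Override
        public boolean contains(Object o) {
            return o != null && words.contains(normalize(o.toString()));
        }

        @Override
        public Iterator<String> iterator() {
            return words.iterator();
        }

        @Override
        public int size() {
            return words.size();
        }
    }

    // Mirrors the documented rule: reuse a case-aware set as is, otherwise wrap it.
    static CaseAwareSet resolve(Set<?> stopWords, boolean ignoreCase) {
        if (stopWords instanceof CaseAwareSet) {
            return (CaseAwareSet) stopWords; // the ignoreCase argument is ignored here
        }
        return new CaseAwareSet(stopWords, ignoreCase);
    }

    public static void main(String[] args) {
        Set<String> plain = new HashSet<>(Arrays.asList("The", "a"));
        // Ordinary set: wrapped, so ignoreCase = true takes effect.
        System.out.println(resolve(plain, true).contains("THE"));
        // Already case-aware and case-sensitive: reused, so ignoreCase = true has no effect.
        CaseAwareSet strict = new CaseAwareSet(plain, false);
        System.out.println(resolve(strict, true).contains("THE"));
    }
}
```

The practical consequence is that a set built once with makeStopSet() keeps whatever case sensitivity it was built with, regardless of the ignoreCase argument later passed to the constructor.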
Method Detail |
---|
@Deprecated public static final Set<Object> makeStopSet(String... stopWords)
Deprecated. Use makeStopSet(Version, String...) instead.
See Also:
The overload that takes ignoreCase, passing false to ignoreCase
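
public static final Set<Object> makeStopSet(Version matchVersion, String... stopWords)
Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor.
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
stopWords - An array of stopwords
See Also:
The overload that takes ignoreCase, passing false to ignoreCase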

@Deprecated public static final Set<Object> makeStopSet(List<?> stopWords)
Deprecated. Use makeStopSet(Version, List) instead.
Parameters:
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
Returns:
A Set (CharArraySet) containing the words
See Also:
The overload that takes ignoreCase, passing false to ignoreCase

public static final Set<Object> makeStopSet(Version matchVersion, List<?> stopWords)
Builds a Set from a list of stop words, appropriate for passing into the StopFilter constructor.
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
Returns:
A Set (CharArraySet) containing the words
See Also:
The overload that takes ignoreCase, passing false to ignoreCase

@Deprecated public static final Set<Object> makeStopSet(String[] stopWords, boolean ignoreCase)
Deprecated. Use makeStopSet(Version, String[], boolean) instead.
Parameters:
stopWords - An array of stopwords
ignoreCase - If true, all words are lower cased first.

public static final Set<Object> makeStopSet(Version matchVersion, String[] stopWords, boolean ignoreCase)
Creates a stopword set from the given stopword array.
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
stopWords - An array of stopwords
ignoreCase - If true, all words are lower cased first.
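
The effect of the ignoreCase parameter can be illustrated without Lucene. The following is a minimal sketch under stated assumptions: buildStopSet and isStopWord are hypothetical helpers, lowercasing uses Locale.ROOT, and the result is an ordinary HashSet, whereas the documented methods return a CharArraySet.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Plain-Java model only; buildStopSet and isStopWord are hypothetical helpers, not Lucene API.
final class StopSetModel {

    // Lowercases every stop word first when ignoreCase is true, as the parameter doc describes.
    static Set<String> buildStopSet(String[] stopWords, boolean ignoreCase) {
        Set<String> result = new HashSet<>();
        for (String word : stopWords) {
            result.add(ignoreCase ? word.toLowerCase(Locale.ROOT) : word);
        }
        return result;
    }

    // A case-insensitive lookup must lowercase the candidate term the same way.
    static boolean isStopWord(Set<String> stopSet, String term, boolean ignoreCase) {
        return stopSet.contains(ignoreCase ? term.toLowerCase(Locale.ROOT) : term);
    }

    public static void main(String[] args) {
        String[] words = {"The", "AND", "of"};
        Set<String> insensitive = buildStopSet(words, true);
        Set<String> sensitive = buildStopSet(words, false);
        System.out.println(isStopWord(insensitive, "the", true));  // prints true
        System.out.println(isStopWord(sensitive, "the", false));   // prints false
    }
}
```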

@Deprecated public static final Set<Object> makeStopSet(List<?> stopWords, boolean ignoreCase)
Deprecated. Use makeStopSet(Version, List, boolean) instead.
Parameters:
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
ignoreCase - if true, all words are lower cased first
Returns:
A Set (CharArraySet) containing the words

public static final Set<Object> makeStopSet(Version matchVersion, List<?> stopWords, boolean ignoreCase)
Creates a stopword set from the given stopword list.
Parameters:
matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
ignoreCase - if true, all words are lower cased first
Returns:
A Set (CharArraySet) containing the words

protected boolean accept() throws IOException
Returns the next input Token whose term() is not a stop word.
Specified by:
accept in class FilteringTokenFilter
Throws:
IOException
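
The interplay between accept() and the enablePositionIncrements flag described under Constructor Detail can be sketched in plain Java. This is an illustrative model, not Lucene code: Token, accept, and filter here are hypothetical, and the sketch assumes every incoming term advances the position by exactly 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model only; Token, accept and filter are hypothetical, not Lucene API.
final class PositionIncrementModel {

    // A term plus its distance, in positions, from the previously emitted term.
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    // Plays the role of accept(): keep the term only if it is not a stop word.
    static boolean accept(String term, Set<String> stopWords) {
        return !stopWords.contains(term);
    }

    static List<Token> filter(List<String> terms, Set<String> stopWords,
                              boolean enablePositionIncrements) {
        List<Token> kept = new ArrayList<>();
        int skipped = 0; // stop words removed since the last kept term
        for (String term : terms) {
            if (accept(term, stopWords)) {
                // With increments enabled, the gap left by removed words is recorded.
                int increment = enablePositionIncrements ? 1 + skipped : 1;
                kept.add(new Token(term, increment));
                skipped = 0;
            } else {
                skipped++;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("the", "of"));
        List<String> terms = Arrays.asList("the", "king", "of", "spain");
        for (Token t : filter(terms, stop, true)) {
            System.out.println(t.term + " +" + t.positionIncrement);
        }
    }
}
```

In this model, with the flag set to true, "king" and "spain" each carry an increment of 2, so a phrase query can tell that a word was removed before each of them. With the flag set to false, both carry an increment of 1 and the gaps are lost.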

@Deprecated public static boolean getEnablePositionIncrementsVersionDefault(Version matchVersion)
Deprecated. Use StopFilter(Version, TokenStream, Set) instead.