java.lang.Object
  org.apache.lucene.util.AttributeSource
    org.apache.lucene.analysis.TokenStream
      org.apache.lucene.analysis.TokenFilter
        org.apache.lucene.analysis.StopFilter
public final class StopFilter
Removes stop words from a token stream.
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource |
---|
AttributeSource.AttributeFactory, AttributeSource.State |
Field Summary |
---|
Fields inherited from class org.apache.lucene.analysis.TokenFilter |
---|
input |
Constructor Summary | |
---|---|
StopFilter(boolean enablePositionIncrements,
TokenStream in,
Set stopWords)
Constructs a filter which removes words from the input TokenStream that are named in the Set. |
|
StopFilter(boolean enablePositionIncrements,
TokenStream input,
Set stopWords,
boolean ignoreCase)
Construct a token stream filtering the given input. |
|
StopFilter(boolean enablePositionIncrements,
TokenStream input,
String[] stopWords)
Deprecated. Use StopFilter(boolean, TokenStream, Set) instead. |
|
StopFilter(boolean enablePositionIncrements,
TokenStream in,
String[] stopWords,
boolean ignoreCase)
Deprecated. Use StopFilter(boolean, TokenStream, Set, boolean) instead. |
|
StopFilter(TokenStream in,
Set stopWords)
Deprecated. Use StopFilter(boolean, TokenStream, Set) instead |
|
StopFilter(TokenStream input,
Set stopWords,
boolean ignoreCase)
Deprecated. Use StopFilter(boolean, TokenStream, Set, boolean) instead |
|
StopFilter(TokenStream input,
String[] stopWords)
Deprecated. Use StopFilter(boolean, TokenStream, String[]) instead |
|
StopFilter(TokenStream in,
String[] stopWords,
boolean ignoreCase)
Deprecated. Use StopFilter(boolean, TokenStream, String[], boolean) instead |
Method Summary | |
---|---|
boolean |
getEnablePositionIncrements()
|
static boolean |
getEnablePositionIncrementsDefault()
Deprecated. Please specify this when you create the StopFilter |
static boolean |
getEnablePositionIncrementsVersionDefault(Version matchVersion)
Returns version-dependent default for enablePositionIncrements. |
boolean |
incrementToken()
Returns the next input Token whose term() is not a stop word. |
void |
init()
|
static Set |
makeStopSet(List stopWords)
Builds a Set from a List of stop words, appropriate for passing into the StopFilter constructor. |
static Set |
makeStopSet(List stopWords,
boolean ignoreCase)
|
static Set |
makeStopSet(String[] stopWords)
Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor. |
static Set |
makeStopSet(String[] stopWords,
boolean ignoreCase)
|
void |
setEnablePositionIncrements(boolean enable)
If true, this StopFilter will preserve
positions of the incoming tokens (i.e., accumulate and
set position increments of the removed stop tokens). |
static void |
setEnablePositionIncrementsDefault(boolean defaultValue)
Deprecated. Please specify this when you create the StopFilter |
Methods inherited from class org.apache.lucene.analysis.TokenFilter |
---|
close, end, reset |
Methods inherited from class org.apache.lucene.analysis.TokenStream |
---|
getOnlyUseNewAPI, next, next, setOnlyUseNewAPI |
Methods inherited from class org.apache.lucene.util.AttributeSource |
---|
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, restoreState, toString |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public StopFilter(TokenStream input, String[] stopWords)
Deprecated. Use StopFilter(boolean, TokenStream, String[]) instead.

public StopFilter(boolean enablePositionIncrements, TokenStream input, String[] stopWords)
Deprecated. Use StopFilter(boolean, TokenStream, Set) instead.
Parameters:
enablePositionIncrements - true if token positions should record the removed stop words
input - input TokenStream
stopWords - array of stop words

public StopFilter(TokenStream in, String[] stopWords, boolean ignoreCase)
Deprecated. Use StopFilter(boolean, TokenStream, String[], boolean) instead.

public StopFilter(boolean enablePositionIncrements, TokenStream in, String[] stopWords, boolean ignoreCase)
Deprecated. Use StopFilter(boolean, TokenStream, Set, boolean) instead.
Parameters:
enablePositionIncrements - true if token positions should record the removed stop words
in - input TokenStream
stopWords - array of stop words
ignoreCase - true if case is ignored

public StopFilter(TokenStream input, Set stopWords, boolean ignoreCase)
Deprecated. Use StopFilter(boolean, TokenStream, Set, boolean) instead.
If stopWords is an instance of CharArraySet (true if makeStopSet() was used to construct the set), it will be used directly and ignoreCase will be ignored, since CharArraySet directly controls case sensitivity.
If stopWords is not an instance of CharArraySet, a new CharArraySet will be constructed and ignoreCase will be used to specify the case sensitivity of that set.
Parameters:
input - Input TokenStream
stopWords - The set of stop words.
ignoreCase - Ignore case when stopping.

public StopFilter(boolean enablePositionIncrements, TokenStream input, Set stopWords, boolean ignoreCase)
Construct a token stream filtering the given input.
If stopWords is an instance of CharArraySet (true if makeStopSet() was used to construct the set), it will be used directly and ignoreCase will be ignored, since CharArraySet directly controls case sensitivity.
If stopWords is not an instance of CharArraySet, a new CharArraySet will be constructed and ignoreCase will be used to specify the case sensitivity of that set.
Parameters:
enablePositionIncrements - true if token positions should record the removed stop words
input - Input TokenStream
stopWords - The set of stop words.
ignoreCase - Ignore case when stopping.

public StopFilter(TokenStream in, Set stopWords)
Deprecated. Use StopFilter(boolean, TokenStream, Set) instead.
See Also:
makeStopSet(java.lang.String[])

public StopFilter(boolean enablePositionIncrements, TokenStream in, Set stopWords)
Constructs a filter which removes words from the input TokenStream that are named in the Set.
Parameters:
enablePositionIncrements - true if token positions should record the removed stop words
in - Input stream
stopWords - The set of stop words.
See Also:
makeStopSet(java.lang.String[])
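
The rule for the Set-based constructors (a CharArraySet is used as-is and ignoreCase is ignored; any other Set is copied into a new set that honors ignoreCase) can be modeled in plain Java. The sketch below does not use Lucene: CaseAwareSet is a hypothetical stand-in for CharArraySet, and resolve is an assumed helper name, not a method of StopFilter.

```java
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Locale;
import java.util.Set;

class StopSetResolution {

    /** Hypothetical stand-in for CharArraySet: a set that owns its case-sensitivity setting. */
    static final class CaseAwareSet extends AbstractSet<String> {
        private final Set<String> words = new HashSet<String>();
        final boolean ignoreCase;

        CaseAwareSet(Iterable<String> source, boolean ignoreCase) {
            this.ignoreCase = ignoreCase;
            for (String w : source) {
                words.add(normalize(w));
            }
        }

        // Lower-case only when this set was built to ignore case.
        private String normalize(String w) {
            return ignoreCase ? w.toLowerCase(Locale.ROOT) : w;
        }

        @Override
        public boolean contains(Object o) {
            return o instanceof String && words.contains(normalize((String) o));
        }

        @Override
        public Iterator<String> iterator() {
            return words.iterator();
        }

        @Override
        public int size() {
            return words.size();
        }
    }

    /** Assumed helper: mirrors the documented constructor rule, not Lucene's actual code. */
    public static CaseAwareSet resolve(Set<String> stopWords, boolean ignoreCase) {
        if (stopWords instanceof CaseAwareSet) {
            // The set already controls case sensitivity, so ignoreCase is ignored.
            return (CaseAwareSet) stopWords;
        }
        // Any other Set is copied, and ignoreCase decides the new set's behavior.
        return new CaseAwareSet(stopWords, ignoreCase);
    }

    public static void main(String[] args) {
        Set<String> plain = new HashSet<String>(Arrays.asList("the", "and"));
        CaseAwareSet built = resolve(plain, true);
        System.out.println(built.contains("The"));  // true

        CaseAwareSet reused = resolve(built, false);
        System.out.println(reused == built);        // true: same instance, flag ignored
        System.out.println(reused.contains("AND")); // true: still case-insensitive
    }
}
```

The practical consequence is that the ignoreCase argument only matters for a plain Set. For a set produced by makeStopSet(), case sensitivity has to be chosen when the set is built.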
Method Detail |
---|
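public void init()

public static final Set makeStopSet(String[] stopWords)
Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor.
See Also:
makeStopSet(String[], boolean), passing false to ignoreCase

public static final Set makeStopSet(List stopWords)
Builds a Set from a List of stop words, appropriate for passing into the StopFilter constructor.
See Also:
makeStopSet(List, boolean), passing false to ignoreCase

public static final Set makeStopSet(String[] stopWords, boolean ignoreCase)
Parameters:
stopWords - An array of stopwords
ignoreCase - If true, all words are lower cased first.

public static final Set makeStopSet(List stopWords, boolean ignoreCase)
Parameters:
stopWords - A List of Strings representing the stopwords
ignoreCase - If true, all words are lower cased first.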
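
As a rough illustration of how the four makeStopSet overloads relate, the plain-Java sketch below has the one-argument forms delegate with ignoreCase set to false and lower-cases words first when ignoreCase is true. It is a model only: the class name StopSetSketch is hypothetical, and it returns a java.util.HashSet rather than Lucene's own set type.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class StopSetSketch {

    /** Core overload: optionally lower-cases each word before adding it. */
    public static Set<String> makeStopSet(List<String> stopWords, boolean ignoreCase) {
        Set<String> set = new HashSet<String>(stopWords.size());
        for (String word : stopWords) {
            set.add(ignoreCase ? word.toLowerCase(Locale.ROOT) : word);
        }
        return set;
    }

    /** Array form: wraps the array and delegates to the List form. */
    public static Set<String> makeStopSet(String[] stopWords, boolean ignoreCase) {
        return makeStopSet(Arrays.asList(stopWords), ignoreCase);
    }

    /** One-argument forms pass false to ignoreCase. */
    public static Set<String> makeStopSet(String[] stopWords) {
        return makeStopSet(stopWords, false);
    }

    public static Set<String> makeStopSet(List<String> stopWords) {
        return makeStopSet(stopWords, false);
    }

    public static void main(String[] args) {
        Set<String> folded = makeStopSet(new String[] {"The", "AND"}, true);
        System.out.println(folded.contains("the")); // true
        System.out.println(folded.contains("The")); // false: stored lower-cased

        Set<String> exact = makeStopSet(new String[] {"The"});
        System.out.println(exact.contains("The"));  // true: case preserved
    }
}
```

Because the words are stored lower-cased in this model, lookups must also be lower-cased by the caller. A case-aware set type handles that internally instead.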
public final boolean incrementToken() throws IOException
Returns the next input Token whose term() is not a stop word.
Specified by:
incrementToken in class TokenStream
Note that this method will be defined abstract in Lucene 3.0.
Throws:
IOException
public static boolean getEnablePositionIncrementsDefault()
Deprecated. Please specify this when you create the StopFilter.
See Also:
setEnablePositionIncrementsDefault(boolean)

public static boolean getEnablePositionIncrementsVersionDefault(Version matchVersion)
Returns version-dependent default for enablePositionIncrements. Prior to 2.9, this returns getEnablePositionIncrementsDefault(). On 2.9 or later, it returns true.

public static void setEnablePositionIncrementsDefault(boolean defaultValue)
Deprecated. Please specify this when you create the StopFilter.
Note: the behavior of a single StopFilter instance can be modified with setEnablePositionIncrements(boolean). This static method allows control over the behavior of classes that use StopFilters internally, for example StandardAnalyzer when constructed with the no-arg constructor.
Default: false.
See Also:
setEnablePositionIncrements(boolean)

public boolean getEnablePositionIncrements()
See Also:
setEnablePositionIncrements(boolean)
public void setEnablePositionIncrements(boolean enable)
If true, this StopFilter will preserve positions of the incoming tokens (i.e., accumulate and set position increments of the removed stop tokens). Generally, true is best as it does not lose information (positions of the original tokens) during indexing.
When enabled, each time a token is stopped (omitted), the position increment of the next emitted token is increased accordingly.
NOTE: be sure to also set QueryParser.setEnablePositionIncrements(boolean) if you use QueryParser to create queries.
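
To make the effect of enablePositionIncrements concrete, the following plain-Java sketch models the filtering loop: increments of removed stop tokens are accumulated and, when the flag is true, added to the next token that is emitted. It is a simplified model under stated assumptions, not Lucene's implementation. Tok and filter are hypothetical names, and tokens are held in a List rather than pulled from a TokenStream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PositionIncrementSketch {

    /** Hypothetical token: a term plus its position increment relative to the previous token. */
    static final class Tok {
        final String term;
        final int positionIncrement;

        Tok(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return term + "(+" + positionIncrement + ")";
        }
    }

    /** Drops stop words; optionally folds their increments into the next emitted token. */
    public static List<Tok> filter(List<Tok> input, Set<String> stopWords,
                                   boolean enablePositionIncrements) {
        List<Tok> out = new ArrayList<Tok>();
        int skipped = 0; // increments of stop tokens removed since the last emitted token
        for (Tok t : input) {
            if (stopWords.contains(t.term)) {
                skipped += t.positionIncrement;
                continue;
            }
            int increment = enablePositionIncrements
                    ? t.positionIncrement + skipped
                    : t.positionIncrement;
            out.add(new Tok(t.term, increment));
            skipped = 0;
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tok> input = Arrays.asList(
                new Tok("the", 1), new Tok("quick", 1), new Tok("and", 1),
                new Tok("the", 1), new Tok("fox", 1));
        Set<String> stop = new HashSet<String>(Arrays.asList("the", "and"));

        System.out.println(filter(input, stop, true));  // [quick(+2), fox(+3)]
        System.out.println(filter(input, stop, false)); // [quick(+1), fox(+1)]
    }
}
```

With the flag enabled, "fox" is still three positions after "quick", so the gap left by the removed words remains visible in the increments. With it disabled, the two tokens appear adjacent.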