StopFilter (Lucene 2.9.4 API)

org.apache.lucene.analysis
Class StopFilter

java.lang.Object
  org.apache.lucene.util.AttributeSource
      org.apache.lucene.analysis.TokenStream
          org.apache.lucene.analysis.TokenFilter
              org.apache.lucene.analysis.StopFilter

public final class StopFilter
extends TokenFilter

Removes stop words from a token stream.

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
`AttributeSource.AttributeFactory, AttributeSource.State`

Field Summary

Fields inherited from class org.apache.lucene.analysis.TokenFilter
`input`

Constructor Summary
`StopFilter(boolean enablePositionIncrements, TokenStream in, Set stopWords)` Constructs a filter which removes words from the input TokenStream that are named in the Set.
`StopFilter(boolean enablePositionIncrements, TokenStream input, Set stopWords, boolean ignoreCase)` Construct a token stream filtering the given input.
`StopFilter(boolean enablePositionIncrements, TokenStream input, String[] stopWords)` Deprecated. Use `StopFilter(boolean, TokenStream, Set)` instead.
`StopFilter(boolean enablePositionIncrements, TokenStream in, String[] stopWords, boolean ignoreCase)` Deprecated. Use `StopFilter(boolean, TokenStream, Set, boolean)` instead.
`StopFilter(TokenStream in, Set stopWords)` Deprecated. Use `StopFilter(boolean, TokenStream, Set)` instead.
`StopFilter(TokenStream input, Set stopWords, boolean ignoreCase)` Deprecated. Use `StopFilter(boolean, TokenStream, Set, boolean)` instead.
`StopFilter(TokenStream input, String[] stopWords)` Deprecated. Use `StopFilter(boolean, TokenStream, String[])` instead.
`StopFilter(TokenStream in, String[] stopWords, boolean ignoreCase)` Deprecated. Use `StopFilter(boolean, TokenStream, String[], boolean)` instead.

Method Summary
`boolean`	`getEnablePositionIncrements()`
`static boolean`	`getEnablePositionIncrementsDefault()` Deprecated. Please specify this when you create the StopFilter
`static boolean`	`getEnablePositionIncrementsVersionDefault(Version matchVersion)` Returns version-dependent default for enablePositionIncrements.
`boolean`	`incrementToken()` Returns the next input Token whose term() is not a stop word.
`void`	`init()`
`static Set`	`makeStopSet(List stopWords)` Builds a Set from a List of stop words, appropriate for passing into the StopFilter constructor.
`static Set`	`makeStopSet(List stopWords, boolean ignoreCase)`
`static Set`	`makeStopSet(String[] stopWords)` Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor.
`static Set`	`makeStopSet(String[] stopWords, boolean ignoreCase)`
`void`	`setEnablePositionIncrements(boolean enable)` If `true`, this StopFilter will preserve positions of the incoming tokens (i.e., accumulate and set position increments of the removed stop tokens).
`static void`	`setEnablePositionIncrementsDefault(boolean defaultValue)` Deprecated. Please specify this when you create the StopFilter

Methods inherited from class org.apache.lucene.analysis.TokenFilter
`close, end, reset`

Methods inherited from class org.apache.lucene.analysis.TokenStream
`getOnlyUseNewAPI, next, next, setOnlyUseNewAPI`

Methods inherited from class org.apache.lucene.util.AttributeSource
`addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, restoreState, toString`

Methods inherited from class java.lang.Object
`clone, finalize, getClass, notify, notifyAll, wait, wait, wait`

Constructor Detail

StopFilter

public StopFilter(TokenStream input,
                  String[] stopWords)

Deprecated. Use StopFilter(boolean, TokenStream, String[]) instead

Construct a token stream filtering the given input.

StopFilter

public StopFilter(boolean enablePositionIncrements,
                  TokenStream input,
                  String[] stopWords)

Deprecated. Use StopFilter(boolean, TokenStream, Set) instead.

Construct a token stream filtering the given input.

Parameters:
enablePositionIncrements - true if token positions should record the removed stop words
input - input TokenStream
stopWords - array of stop words

StopFilter

public StopFilter(TokenStream in,
                  String[] stopWords,
                  boolean ignoreCase)

Deprecated. Use StopFilter(boolean, TokenStream, String[], boolean) instead

Constructs a filter which removes words from the input TokenStream that are named in the array of words.

StopFilter

public StopFilter(boolean enablePositionIncrements,
                  TokenStream in,
                  String[] stopWords,
                  boolean ignoreCase)

Deprecated. Use StopFilter(boolean, TokenStream, Set, boolean) instead.

Constructs a filter which removes words from the input TokenStream that are named in the array of words.

Parameters:
enablePositionIncrements - true if token positions should record the removed stop words
in - input TokenStream
stopWords - array of stop words
ignoreCase - true if case is ignored

StopFilter

public StopFilter(TokenStream input,
                  Set stopWords,
                  boolean ignoreCase)

Deprecated. Use StopFilter(boolean, TokenStream, Set, boolean) instead

Construct a token stream filtering the given input. If stopWords is an instance of CharArraySet (true if makeStopSet() was used to construct the set) it will be directly used and ignoreCase will be ignored since CharArraySet directly controls case sensitivity.

If stopWords is not an instance of CharArraySet, a new CharArraySet will be constructed and ignoreCase will be used to specify the case sensitivity of that set.

Parameters:
input - Input TokenStream
stopWords - The set of stop words.
ignoreCase - Ignore case when stopping.

StopFilter

public StopFilter(boolean enablePositionIncrements,
                  TokenStream input,
                  Set stopWords,
                  boolean ignoreCase)

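Construct a token stream filtering the given input. If stopWords is an instance of CharArraySet (true if makeStopSet() was used to construct the set) it will be directly used and ignoreCase will be ignored since CharArraySet directly controls case sensitivity.

If stopWords is not an instance of CharArraySet, a new CharArraySet will be constructed and ignoreCase will be used to specify the case sensitivity of that set.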

Parameters:
enablePositionIncrements - true if token positions should record the removed stop words
input - Input TokenStream
stopWords - The set of stop words.
ignoreCase - Ignore case when stopping.

StopFilter

public StopFilter(TokenStream in,
                  Set stopWords)

Deprecated. Use StopFilter(boolean, TokenStream, Set) instead

Constructs a filter which removes words from the input TokenStream that are named in the Set.

See Also: makeStopSet(java.lang.String[])

StopFilter

public StopFilter(boolean enablePositionIncrements,
                  TokenStream in,
                  Set stopWords)

Constructs a filter which removes words from the input TokenStream that are named in the Set.

Parameters:
enablePositionIncrements - true if token positions should record the removed stop words
in - Input stream
stopWords - The set of stop words.

See Also: makeStopSet(java.lang.String[])

Method Detail

init

public void init()

makeStopSet

public static final Set makeStopSet(String[] stopWords)

Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor. This permits this stopWords construction to be cached once when an Analyzer is constructed.

See Also: makeStopSet(String[], boolean), passing false to ignoreCase

makeStopSet

public static final Set makeStopSet(List stopWords)

Builds a Set from a List of stop words, appropriate for passing into the StopFilter constructor. This permits this stopWords construction to be cached once when an Analyzer is constructed.

See Also: makeStopSet(List, boolean), passing false to ignoreCase

makeStopSet

public static final Set makeStopSet(String[] stopWords,
                                    boolean ignoreCase)

Parameters:
stopWords - An array of stop words
ignoreCase - If true, all words are lower-cased first.

Returns: a Set containing the words

makeStopSet

public static final Set makeStopSet(List stopWords,
                                    boolean ignoreCase)

Parameters:
stopWords - A List of Strings representing the stop words
ignoreCase - If true, all words are lower-cased first.

Returns: a Set containing the words
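
The effect of ignoreCase can be illustrated with a small standard-library model. This is only a sketch: `StopSetSketch` is a hypothetical class, it uses a plain HashSet rather than Lucene's CharArraySet, and it assumes that lookups are normalized the same way as the stored words.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical stdlib-only model of a stop set; NOT Lucene's CharArraySet. */
final class StopSetSketch {
    private final Set<String> words = new HashSet<String>();
    private final boolean ignoreCase;

    StopSetSketch(List<String> stopWords, boolean ignoreCase) {
        this.ignoreCase = ignoreCase;
        for (String w : stopWords) {
            // With ignoreCase, every word is lower-cased once, at build time.
            words.add(ignoreCase ? w.toLowerCase(Locale.ROOT) : w);
        }
    }

    /** Assumption of this model: lookups are normalized like the stored words. */
    boolean contains(String term) {
        return words.contains(ignoreCase ? term.toLowerCase(Locale.ROOT) : term);
    }

    public static void main(String[] args) {
        List<String> stops = Arrays.asList("The", "a", "of");
        System.out.println(new StopSetSketch(stops, true).contains("THE"));  // true
        System.out.println(new StopSetSketch(stops, false).contains("the")); // false
    }
}
```

Building the set once and reusing it mirrors the caching advice given for makeStopSet: the normalization cost is paid when the set is constructed, not for every token stream.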

incrementToken

public final boolean incrementToken()
                             throws IOException

Returns the next input Token whose term() is not a stop word.

Overrides: incrementToken in class TokenStream

Returns: false for end of stream; true otherwise

Note that this method will be defined abstract in Lucene 3.0.

Throws: IOException
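
The pull-style contract described above (advance to the next term that is not a stop word, return false at end of stream) can be sketched without Lucene. `PullFilterSketch` and its `term()` accessor are hypothetical names for illustration; a real StopFilter works on attributes of a TokenStream, not on an Iterator of Strings.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

/** Hypothetical pull-style model of incrementToken(); not Lucene's code. */
final class PullFilterSketch {
    private final Iterator<String> input;
    private final Set<String> stopWords;
    private String current; // stands in for the current token's term

    PullFilterSketch(Iterator<String> input, Set<String> stopWords) {
        this.input = input;
        this.stopWords = stopWords;
    }

    /** Advances to the next non-stop term; returns false at end of stream. */
    boolean incrementToken() {
        while (input.hasNext()) {
            String term = input.next();
            if (!stopWords.contains(term)) {
                current = term;
                return true;
            }
            // Stop word: skip it and keep pulling from the input.
        }
        return false;
    }

    String term() {
        return current;
    }

    public static void main(String[] args) {
        Set<String> stops = new HashSet<String>(Arrays.asList("the", "of"));
        PullFilterSketch filter = new PullFilterSketch(
                Arrays.asList("the", "lord", "of", "the", "rings").iterator(), stops);
        while (filter.incrementToken()) {
            System.out.println(filter.term()); // prints "lord", then "rings"
        }
    }
}
```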

getEnablePositionIncrementsDefault

public static boolean getEnablePositionIncrementsDefault()

Deprecated. Please specify this when you create the StopFilter

See Also: setEnablePositionIncrementsDefault(boolean)

getEnablePositionIncrementsVersionDefault

public static boolean getEnablePositionIncrementsVersionDefault(Version matchVersion)

Returns the version-dependent default for enablePositionIncrements. Analyzers that embed StopFilter use this method when creating the StopFilter. For a matchVersion earlier than 2.9, this returns getEnablePositionIncrementsDefault(). For 2.9 or later, it returns true.
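
The selection rule can be written out as a short sketch. Everything here is a stand-in: `MatchVersion` is a hypothetical two-value enum rather than Lucene's Version class, and `legacyDefault` models the deprecated static default, which this page documents as false.

```java
/** Hypothetical model of the version-dependent default; not Lucene's code. */
final class VersionDefaultSketch {
    /** Stand-in for Lucene's version constants, ordered oldest to newest. */
    enum MatchVersion { V_2_4, V_2_9 }

    /** Stand-in for the deprecated static default (documented default: false). */
    static boolean legacyDefault = false;

    static boolean versionDefault(MatchVersion matchVersion) {
        if (matchVersion.compareTo(MatchVersion.V_2_9) >= 0) {
            return true; // 2.9 or later: position increments are enabled
        }
        return legacyDefault; // earlier versions fall back to the static default
    }

    public static void main(String[] args) {
        System.out.println(versionDefault(MatchVersion.V_2_9)); // true
        System.out.println(versionDefault(MatchVersion.V_2_4)); // false
    }
}
```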

setEnablePositionIncrementsDefault

public static void setEnablePositionIncrementsDefault(boolean defaultValue)

Deprecated. Please specify this when you create the StopFilter

Set the default position increments behavior of every StopFilter created from now on.

Note: the behavior of a single StopFilter instance can be modified with setEnablePositionIncrements(boolean). This static method controls the behavior of classes that use StopFilter internally, for example StandardAnalyzer when created with its no-argument constructor.

Default: false.

See Also: setEnablePositionIncrements(boolean)

getEnablePositionIncrements

public boolean getEnablePositionIncrements()

See Also: setEnablePositionIncrements(boolean)

setEnablePositionIncrements

public void setEnablePositionIncrements(boolean enable)

If true, this StopFilter will preserve positions of the incoming tokens (i.e., accumulate and set position increments of the removed stop tokens). Generally, true is best as it does not lose information (positions of the original tokens) during indexing.

When enabled, each stopped (omitted) token adds its position increment to that of the next token that is kept.

NOTE: be sure to also set QueryParser.setEnablePositionIncrements(boolean) if you use QueryParser to create queries.
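
The accumulation described above can be illustrated with a minimal sketch. It is a standard-library model, not StopFilter's implementation: `PositionIncrementSketch` is a hypothetical class, and it assumes every incoming token carries a position increment of 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical model of position-increment handling; not Lucene's code. */
final class PositionIncrementSketch {
    /** Returns the kept terms, each annotated with its position increment. */
    static List<String> filter(List<String> tokens, Set<String> stopWords,
                               boolean enablePositionIncrements) {
        List<String> out = new ArrayList<String>();
        int skipped = 0; // increments of stop words removed since the last kept token
        for (String term : tokens) {
            if (stopWords.contains(term)) {
                skipped += 1; // assumption: each incoming token has increment 1
                continue;
            }
            int increment = enablePositionIncrements ? 1 + skipped : 1;
            out.add(term + "(+" + increment + ")");
            skipped = 0;
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stops = new HashSet<String>(Arrays.asList("the", "of"));
        List<String> tokens = Arrays.asList("the", "quick", "fox", "of", "the", "north");
        System.out.println(filter(tokens, stops, true));  // [quick(+2), fox(+1), north(+3)]
        System.out.println(filter(tokens, stops, false)); // [quick(+1), fox(+1), north(+1)]
    }
}
```

With the flag enabled, "north" stays three positions after "fox", so the gap left by "of the" is still visible to position-sensitive queries. With it disabled, the kept tokens appear adjacent.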
