StopFilter (Lucene 4.2.1 API)

org.apache.lucene.analysis.core
Class StopFilter

java.lang.Object
  org.apache.lucene.util.AttributeSource
      org.apache.lucene.analysis.TokenStream
          org.apache.lucene.analysis.TokenFilter
              org.apache.lucene.analysis.util.FilteringTokenFilter
                  org.apache.lucene.analysis.core.StopFilter

All Implemented Interfaces: Closeable

public final class StopFilter
extends FilteringTokenFilter

Removes stop words from a token stream.

You must specify the required Version compatibility when creating a StopFilter:

As of 3.1, StopFilter correctly handles Unicode 4.0 supplementary characters in stopwords, and position increments are preserved.
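
To make "position increments are preserved" concrete, here is a minimal plain-Java sketch. It does not use Lucene classes; StopFilterModel, Tok and filter are hypothetical names chosen for illustration. The assumption is that each surviving token's increment is one plus the number of stop words removed immediately before it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model only; none of these names are Lucene classes.
class StopFilterModel {

    /** A surviving term plus its position increment (hypothetical helper type). */
    static final class Tok {
        final String term;
        final int posInc;

        Tok(String term, int posInc) {
            this.term = term;
            this.posInc = posInc;
        }

        @Override
        public String toString() {
            return term + "(+" + posInc + ")";
        }
    }

    /** Drops stop words; each kept term's increment counts the terms skipped before it. */
    static List<Tok> filter(List<String> terms, Set<String> stopWords) {
        List<Tok> kept = new ArrayList<>();
        int skipped = 0;
        for (String term : terms) {
            if (stopWords.contains(term)) {
                skipped++;      // remember the gap left by the removed word
                continue;
            }
            kept.add(new Tok(term, 1 + skipped));
            skipped = 0;        // the gap has been recorded on this token
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("the", "of"));
        List<String> terms = Arrays.asList("the", "history", "of", "the", "web");
        System.out.println(filter(terms, stop)); // [history(+2), web(+3)]
    }
}
```

Keeping the gap in the increment means a phrase query can still tell that "history" and "web" were not adjacent in the original text.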

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
`AttributeSource.AttributeFactory, AttributeSource.State`

Field Summary

Fields inherited from class org.apache.lucene.analysis.TokenFilter
`input`

Constructor Summary
`StopFilter(Version matchVersion, TokenStream in, CharArraySet stopWords)` Constructs a filter which removes words from the input TokenStream that are named in the Set.

Method Summary
`protected boolean`	`accept()` Returns true if the current token's term is not a stop word, so that the token is kept.
`static CharArraySet`	`makeStopSet(Version matchVersion, List<?> stopWords)` Builds a Set from a list of stop words, appropriate for passing into the StopFilter constructor.
`static CharArraySet`	`makeStopSet(Version matchVersion, List<?> stopWords, boolean ignoreCase)` Creates a stopword set from the given stopword list.
`static CharArraySet`	`makeStopSet(Version matchVersion, String... stopWords)` Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor.
`static CharArraySet`	`makeStopSet(Version matchVersion, String[] stopWords, boolean ignoreCase)` Creates a stopword set from the given stopword array.

Methods inherited from class org.apache.lucene.analysis.util.FilteringTokenFilter
`getEnablePositionIncrements, incrementToken, reset, setEnablePositionIncrements`

Methods inherited from class org.apache.lucene.analysis.TokenFilter
`close, end`

Methods inherited from class org.apache.lucene.util.AttributeSource
`addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState`

Methods inherited from class java.lang.Object
`clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

StopFilter

public StopFilter(Version matchVersion,
                  TokenStream in,
                  CharArraySet stopWords)

Constructs a filter which removes words from the input TokenStream that are named in the Set.

Parameters:
    matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the stop set if Version > 3.0. See above for details.
    in - Input stream
    stopWords - A CharArraySet representing the stopwords.
See Also:
    makeStopSet(Version, java.lang.String...)
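
The following plain-Java sketch shows why supplementary characters need care when stop words are compared case-insensitively. It is an illustration under assumptions, not Lucene's implementation: SupplementaryLowercase and its two methods are hypothetical names, and the link to matchVersion is inferred only from the parameter description above.

```java
// Plain-Java illustration; not Lucene code.
class SupplementaryLowercase {

    /** Lowercases one UTF-16 unit at a time; surrogate halves pass through unchanged. */
    static String lowerByChar(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            sb.append(Character.toLowerCase(s.charAt(i)));
        }
        return sb.toString();
    }

    /** Lowercases whole code points, so supplementary characters are mapped too. */
    static String lowerByCodePoint(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            sb.appendCodePoint(Character.toLowerCase(cp));
            i += Character.charCount(cp);   // advance by 1 or 2 UTF-16 units
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+10400 DESERET CAPITAL LETTER LONG I; its lowercase form is U+10428.
        String upper = new String(Character.toChars(0x10400));
        String lower = new String(Character.toChars(0x10428));
        System.out.println(lowerByChar(upper).equals(lower));      // false
        System.out.println(lowerByCodePoint(upper).equals(lower)); // true
    }
}
```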

Method Detail

makeStopSet

public static CharArraySet makeStopSet(Version matchVersion,
                                       String... stopWords)

Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor. This permits this stopWords construction to be cached once when an Analyzer is constructed.

Parameters:
    matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
    stopWords - An array of stopwords
See Also:
    makeStopSet(Version, String[], boolean), passing false to ignoreCase

makeStopSet

public static CharArraySet makeStopSet(Version matchVersion,
                                       List<?> stopWords)

Builds a Set from a list of stop words, appropriate for passing into the StopFilter constructor. This permits this stopWords construction to be cached once when an Analyzer is constructed.

Parameters:
    matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
    stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
Returns:
    A Set (CharArraySet) containing the words
See Also:
    the makeStopSet overloads that take ignoreCase; this method behaves as if false were passed

makeStopSet

public static CharArraySet makeStopSet(Version matchVersion,
                                       String[] stopWords,
                                       boolean ignoreCase)

Creates a stopword set from the given stopword array.

Parameters:
    matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
    stopWords - An array of stopwords
    ignoreCase - If true, all words are lower cased first.
Returns:
    A Set (CharArraySet) containing the words

makeStopSet

public static CharArraySet makeStopSet(Version matchVersion,
                                       List<?> stopWords,
                                       boolean ignoreCase)

Creates a stopword set from the given stopword list.

Parameters:
    matchVersion - Lucene version to enable correct Unicode 4.0 behavior in the returned set if Version > 3.0
    stopWords - A List of Strings or char[] or any other toString()-able list representing the stopwords
    ignoreCase - If true, all words are lower cased first.
Returns:
    A Set (CharArraySet) containing the words
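
The effect of ignoreCase can be sketched with a small plain-Java model. StopSetModel is a hypothetical stand-in for the returned set, not the CharArraySet API. It assumes that with ignoreCase set, both the stored words and the looked-up terms are lower cased, and that list elements may be Strings, char[] or any object with a meaningful toString().

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java model; StopSetModel and its methods are hypothetical, not Lucene API.
class StopSetModel {
    private final Set<String> words = new HashSet<>();
    private final boolean ignoreCase;

    StopSetModel(List<?> stopWords, boolean ignoreCase) {
        this.ignoreCase = ignoreCase;
        for (Object w : stopWords) {
            // Accept Strings, char[] and anything else with a usable toString().
            String s = (w instanceof char[]) ? new String((char[]) w) : w.toString();
            words.add(normalize(s));
        }
    }

    /** Lower cases the word first when ignoreCase is set; otherwise leaves it as-is. */
    private String normalize(String s) {
        return ignoreCase ? s.toLowerCase(Locale.ROOT) : s;
    }

    boolean contains(String term) {
        return words.contains(normalize(term));
    }

    public static void main(String[] args) {
        List<Object> raw = Arrays.<Object>asList("The", "AND".toCharArray());
        StopSetModel insensitive = new StopSetModel(raw, true);
        StopSetModel sensitive = new StopSetModel(raw, false);
        System.out.println(insensitive.contains("the")); // true
        System.out.println(sensitive.contains("the"));   // false
        System.out.println(sensitive.contains("The"));   // true
    }
}
```

The two-argument overloads correspond to the case-sensitive variant here, since they pass false for ignoreCase.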

accept

protected boolean accept()

Returns true if the current token's term is not a stop word, so that the token is kept; returns false to drop it.

Specified by: accept in class FilteringTokenFilter
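
Because accept() returns a boolean rather than a token, it helps to see how a base class might call it. The sketch below is a plain-Java model of that pattern, assuming the inherited incrementToken keeps pulling input until accept() returns true. FilteringModel, StopModel and drain are hypothetical names, not the Lucene classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Plain-Java model of the filtering pattern; these are not the Lucene classes.
abstract class FilteringModel {
    private final Iterator<String> input;
    protected String current;   // term under consideration, visible to accept()

    FilteringModel(Iterator<String> input) {
        this.input = input;
    }

    /** Decides whether the current term is kept. */
    protected abstract boolean accept();

    /** Advances to the next accepted term; returns false when input is exhausted. */
    final boolean incrementToken() {
        while (input.hasNext()) {
            current = input.next();
            if (accept()) {
                return true;
            }
        }
        return false;
    }
}

class StopModel extends FilteringModel {
    private final Set<String> stopWords;

    StopModel(Iterator<String> input, Set<String> stopWords) {
        super(input);
        this.stopWords = stopWords;
    }

    @Override
    protected boolean accept() {
        return !stopWords.contains(current);   // keep everything that is not a stop word
    }

    /** Collects every accepted term, in order. */
    static List<String> drain(List<String> terms, Set<String> stopWords) {
        StopModel model = new StopModel(terms.iterator(), stopWords);
        List<String> out = new ArrayList<>();
        while (model.incrementToken()) {
            out.add(model.current);
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("to", "or"));
        System.out.println(drain(Arrays.asList("to", "be", "or", "not"), stop)); // [be, not]
    }
}
```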
