StopAnalyzer (Lucene 4.5.0 API)

org.apache.lucene.analysis.core
Class StopAnalyzer

java.lang.Object
  org.apache.lucene.analysis.Analyzer
      org.apache.lucene.analysis.util.StopwordAnalyzerBase
          org.apache.lucene.analysis.core.StopAnalyzer

All Implemented Interfaces: Closeable

public final class StopAnalyzer
extends StopwordAnalyzerBase

Filters LetterTokenizer with LowerCaseFilter and StopFilter.

You must specify the required Version compatibility when creating StopAnalyzer:

As of 3.1, StopFilter correctly handles Unicode 4.0 supplementary characters in stopwords
As of 2.9, position increments are preserved
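
The pipeline described above (split on non-letters, lowercase, drop stop words, keep position increments) can be illustrated with a minimal plain-JDK sketch. It is not Lucene's implementation and uses no Lucene classes: the class name `StopPipelineSketch`, the sample stop word list, and the `term+increment` output format are all assumptions made for illustration. A token that follows removed stop words carries an increment greater than 1, which is one way to read "position increments are preserved".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-JDK illustration only; none of these names come from Lucene.
class StopPipelineSketch {

    // Small sample list (assumption); the real default is ENGLISH_STOP_WORDS_SET.
    static final Set<String> SAMPLE_STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "the", "of", "to", "is"));

    // Returns entries of the form "term+increment", where increment is
    // 1 plus the number of stop words removed directly before the term.
    static List<String> analyze(String text, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int skipped = 0;
        // Iterate one position past the end so the final token is flushed.
        for (int i = 0; i <= text.length(); ) {
            int cp = i < text.length() ? text.codePointAt(i) : -1;
            if (cp >= 0 && Character.isLetter(cp)) {
                // Letters extend the current token, lowercased as they arrive.
                current.appendCodePoint(Character.toLowerCase(cp));
            } else if (current.length() > 0) {
                // A non-letter (or end of input) closes the current token.
                String term = current.toString();
                current.setLength(0);
                if (stopWords.contains(term)) {
                    skipped++;
                } else {
                    out.add(term + "+" + (skipped + 1));
                    skipped = 0;
                }
            }
            i += cp >= 0 ? Character.charCount(cp) : 1;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(analyze("The Quick brown-fox of 2 Worlds", SAMPLE_STOP_WORDS));
    }
}
```

Iterating by code point rather than by `char` keeps supplementary characters intact, in the spirit of the 3.1 note above.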

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
`Analyzer.GlobalReuseStrategy, Analyzer.PerFieldReuseStrategy, Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents`

Field Summary
`static CharArraySet`	`ENGLISH_STOP_WORDS_SET` An unmodifiable set containing some common English words that are not usually useful for searching.

Fields inherited from class org.apache.lucene.analysis.util.StopwordAnalyzerBase
`matchVersion, stopwords`

Fields inherited from class org.apache.lucene.analysis.Analyzer
`GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY`

Constructor Summary
`StopAnalyzer(Version matchVersion)` Builds an analyzer which removes words in `ENGLISH_STOP_WORDS_SET`.
`StopAnalyzer(Version matchVersion, CharArraySet stopWords)` Builds an analyzer with the stop words from the given set.
`StopAnalyzer(Version matchVersion, File stopwordsFile)` Builds an analyzer with the stop words from the given file.
`StopAnalyzer(Version matchVersion, Reader stopwords)` Builds an analyzer with the stop words from the given reader.

Method Summary
`protected Analyzer.TokenStreamComponents`	`createComponents(String fieldName, Reader reader)` Creates `Analyzer.TokenStreamComponents` used to tokenize all the text in the provided `Reader`.

Methods inherited from class org.apache.lucene.analysis.util.StopwordAnalyzerBase
`getStopwordSet, loadStopwordSet, loadStopwordSet, loadStopwordSet`

Methods inherited from class org.apache.lucene.analysis.Analyzer
`close, getOffsetGap, getPositionIncrementGap, getReuseStrategy, initReader, tokenStream, tokenStream`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

ENGLISH_STOP_WORDS_SET

public static final CharArraySet ENGLISH_STOP_WORDS_SET

An unmodifiable set containing some common English words that are not usually useful for searching.
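
Because the shared set is unmodifiable, a custom list has to be built from a copy. The sketch below shows that pattern with JDK collections only. `StopSetSketch`, `DEFAULT_STOP_WORDS`, and its four sample words are hypothetical stand-ins, not the contents or type of `ENGLISH_STOP_WORDS_SET`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Plain-JDK illustration only; names and contents are assumptions.
class StopSetSketch {

    // Hypothetical stand-in for a shared, read-only default set.
    static final Set<String> DEFAULT_STOP_WORDS = Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("a", "an", "and", "the")));

    // Copies the shared set and adds domain-specific words to the copy.
    static Set<String> extend(String... extraWords) {
        Set<String> copy = new HashSet<>(DEFAULT_STOP_WORDS);
        copy.addAll(Arrays.asList(extraWords));
        return copy;
    }

    // Returns true if the shared set rejects modification.
    static boolean rejectsModification() {
        try {
            DEFAULT_STOP_WORDS.add("lucene");
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }
}
```

With Lucene itself, the copy would then be passed to the constructor that takes a set of stop words.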

Constructor Detail

StopAnalyzer

public StopAnalyzer(Version matchVersion)

Builds an analyzer which removes words in ENGLISH_STOP_WORDS_SET.

Parameters: matchVersion - Lucene version compatibility (see the class description)

StopAnalyzer

public StopAnalyzer(Version matchVersion,
                    CharArraySet stopWords)

Builds an analyzer with the stop words from the given set.

Parameters: matchVersion - Lucene version compatibility (see the class description); stopWords - Set of stop words

StopAnalyzer

public StopAnalyzer(Version matchVersion,
                    File stopwordsFile)
             throws IOException

Builds an analyzer with the stop words from the given file.

Parameters: matchVersion - Lucene version compatibility (see the class description); stopwordsFile - File to load stop words from
Throws: IOException
See Also: WordlistLoader.getWordSet(Reader, Version)

StopAnalyzer

public StopAnalyzer(Version matchVersion,
                    Reader stopwords)
             throws IOException

Builds an analyzer with the stop words from the given reader.

Parameters: matchVersion - Lucene version compatibility (see the class description); stopwords - Reader to load stop words from
Throws: IOException
See Also: WordlistLoader.getWordSet(Reader, Version)
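
As a rough picture of what loading stop words from a reader involves, here is a plain-JDK sketch that assumes a one-word-per-line format with surrounding whitespace trimmed. `WordListSketch`, `readWords`, and `fromString` are hypothetical names, and skipping blank lines is a choice made for this sketch rather than documented behavior of `WordlistLoader`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashSet;
import java.util.Set;

// Plain-JDK illustration only; not the Lucene WordlistLoader.
class WordListSketch {

    // Reads one word per line, trims whitespace, and skips blank lines
    // (blank-line skipping is an assumption of this sketch).
    static Set<String> readWords(Reader source) throws IOException {
        Set<String> words = new LinkedHashSet<>();
        try (BufferedReader lines = new BufferedReader(source)) {
            String line;
            while ((line = lines.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    // Convenience wrapper for in-memory text; converts the checked exception.
    static Set<String> fromString(String text) {
        try {
            return readWords(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The checked `IOException` on `readWords` mirrors the `throws IOException` clause on the file and reader constructors above.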

Method Detail

createComponents

protected Analyzer.TokenStreamComponents createComponents(String fieldName,
                                                          Reader reader)

Creates Analyzer.TokenStreamComponents used to tokenize all the text in the provided Reader.

Specified by: createComponents in class Analyzer

Returns: Analyzer.TokenStreamComponents built from a LowerCaseTokenizer filtered with StopFilter
