StandardAnalyzer (Lucene 2.9.4 API)

org.apache.lucene.analysis.standard
Class StandardAnalyzer

java.lang.Object
  org.apache.lucene.analysis.Analyzer
      org.apache.lucene.analysis.standard.StandardAnalyzer

public class StandardAnalyzer
extends Analyzer

Filters StandardTokenizer with StandardFilter, LowerCaseFilter and StopFilter, using a list of English stop words.

You must specify the required Version compatibility when creating StandardAnalyzer:

As of 2.9, StopFilter preserves position increments.
As of 2.4, tokens incorrectly identified as acronyms are corrected (see LUCENE-1608).
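
The effect of preserving position increments can be modelled without Lucene. The following is a minimal sketch, not the library's implementation: `StopFilterSketch`, its `analyze` method, the crude split on non-alphanumeric characters and the five-word stop list are all assumptions made for illustration. It lowercases each token, drops stop words, and reports each surviving term together with its position increment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class StopFilterSketch {
    // Illustrative subset only; NOT the full contents of STOP_WORDS_SET.
    static final Set<String> STOP =
        new HashSet<String>(Arrays.asList("a", "an", "and", "of", "the"));

    /**
     * Returns entries of the form "term+increment".
     * With preserveIncrements == true, a term that follows removed stop
     * words carries an increment of 1 plus the number of removed words.
     */
    static List<String> analyze(String text, boolean preserveIncrements) {
        List<String> out = new ArrayList<String>();
        int skipped = 0;
        // Crude stand-in for StandardTokenizer: split on non-alphanumerics.
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue; // leading delimiter yields an empty first element
            }
            String term = raw.toLowerCase(Locale.ROOT); // lowercase step
            if (STOP.contains(term)) {                  // stop-word step
                skipped++;
                continue;
            }
            int increment = preserveIncrements ? 1 + skipped : 1;
            out.add(term + "+" + increment);
            skipped = 0;
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "The Quick fox and the Hound";
        System.out.println(analyze(text, true));  // [quick+2, fox+1, hound+3]
        System.out.println(analyze(text, false)); // [quick+1, fox+1, hound+1]
    }
}
```

With increments preserved, the gaps left by removed stop words stay visible to position-sensitive queries such as phrase queries; without them, "fox" and "hound" would appear adjacent.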

Version:: $Id: StandardAnalyzer.java 829134 2009-10-23 17:18:53Z mikemccand $

Field Summary
`static int`	`DEFAULT_MAX_TOKEN_LENGTH` Default maximum allowed token length
`static String[]`	`STOP_WORDS` Deprecated. Use `STOP_WORDS_SET` instead
`static Set`	`STOP_WORDS_SET` An unmodifiable set containing some common English words that are usually not useful for searching.

Fields inherited from class org.apache.lucene.analysis.Analyzer
`overridesTokenStreamMethod`

Constructor Summary
`StandardAnalyzer()` Deprecated. Use `StandardAnalyzer(Version)` instead.
`StandardAnalyzer(boolean replaceInvalidAcronym)` Deprecated. Remove in 3.X and make true the only valid value
`StandardAnalyzer(File stopwords)` Deprecated. Use `StandardAnalyzer(Version, File)` instead
`StandardAnalyzer(File stopwords, boolean replaceInvalidAcronym)` Deprecated. Remove in 3.X and make true the only valid value
`StandardAnalyzer(Reader stopwords)` Deprecated. Use `StandardAnalyzer(Version, Reader)` instead
`StandardAnalyzer(Reader stopwords, boolean replaceInvalidAcronym)` Deprecated. Remove in 3.X and make true the only valid value
`StandardAnalyzer(Set stopWords)` Deprecated. Use `StandardAnalyzer(Version, Set)` instead
`StandardAnalyzer(Set stopwords, boolean replaceInvalidAcronym)` Deprecated. Remove in 3.X and make true the only valid value
`StandardAnalyzer(String[] stopWords)` Deprecated. Use `StandardAnalyzer(Version, Set)` instead
`StandardAnalyzer(String[] stopwords, boolean replaceInvalidAcronym)` Deprecated. Remove in 3.X and make true the only valid value
`StandardAnalyzer(Version matchVersion)` Builds an analyzer with the default stop words (`STOP_WORDS_SET`).
`StandardAnalyzer(Version matchVersion, File stopwords)` Builds an analyzer with the stop words from the given file.
`StandardAnalyzer(Version matchVersion, Reader stopwords)` Builds an analyzer with the stop words from the given reader.
`StandardAnalyzer(Version matchVersion, Set stopWords)` Builds an analyzer with the given stop words.

Method Summary
`static boolean`	`getDefaultReplaceInvalidAcronym()` Deprecated. This will be removed (hardwired to true) in 3.0
`int`	`getMaxTokenLength()`
`boolean`	`isReplaceInvalidAcronym()` Deprecated. This will be removed (hardwired to true) in 3.0
`TokenStream`	`reusableTokenStream(String fieldName, Reader reader)` Deprecated. Use `tokenStream(java.lang.String, java.io.Reader)` instead
`static void`	`setDefaultReplaceInvalidAcronym(boolean replaceInvalidAcronym)` Deprecated. This will be removed (hardwired to true) in 3.0
`void`	`setMaxTokenLength(int length)` Set maximum allowed token length.
`void`	`setReplaceInvalidAcronym(boolean replaceInvalidAcronym)` Deprecated. This will be removed (hardwired to true) in 3.0
`TokenStream`	`tokenStream(String fieldName, Reader reader)` Constructs a `StandardTokenizer` filtered by a `StandardFilter`, a `LowerCaseFilter` and a `StopFilter`.

Methods inherited from class org.apache.lucene.analysis.Analyzer
`close, getOffsetGap, getPositionIncrementGap, getPreviousTokenStream, setOverridesTokenStreamMethod, setPreviousTokenStream`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

STOP_WORDS

public static final String[] STOP_WORDS

Deprecated. Use STOP_WORDS_SET instead

An array containing some common English words that are usually not useful for searching.

STOP_WORDS_SET

public static final Set STOP_WORDS_SET

An unmodifiable set containing some common English words that are usually not useful for searching.

DEFAULT_MAX_TOKEN_LENGTH

public static final int DEFAULT_MAX_TOKEN_LENGTH

Default maximum allowed token length

See Also:: Constant Field Values
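
To make the role of the token length limit concrete, here is a small sketch under stated assumptions. This page only says "maximum allowed token length"; the sketch assumes that a token longer than the limit is discarded rather than truncated, and assumes a default of 255 (check Constant Field Values for the authoritative number). `MaxTokenLengthSketch` and `tokenize` are hypothetical names, and whitespace splitting stands in for the real tokenizer.

```java
import java.util.ArrayList;
import java.util.List;

final class MaxTokenLengthSketch {
    // Assumed default; see Constant Field Values for the real constant.
    static final int DEFAULT_MAX_TOKEN_LENGTH = 255;

    /** Keeps whitespace-separated tokens whose length is within the limit. */
    static List<String> tokenize(String text, int maxTokenLength) {
        List<String> kept = new ArrayList<String>();
        for (String token : text.split("\\s+")) {
            if (token.isEmpty()) {
                continue; // leading whitespace yields an empty first element
            }
            if (token.length() <= maxTokenLength) {
                kept.add(token); // within the limit: emit unchanged
            }
            // Over the limit: dropped entirely (assumed behaviour).
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("tiny enormousword ok", 5)); // [tiny, ok]
    }
}
```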

Constructor Detail

StandardAnalyzer

public StandardAnalyzer()

Deprecated. Use StandardAnalyzer(Version) instead.

Builds an analyzer with the default stop words (STOP_WORDS_SET).

StandardAnalyzer

public StandardAnalyzer(Version matchVersion)

Builds an analyzer with the default stop words (STOP_WORDS_SET).

Parameters:: matchVersion - Lucene version to match; see the Version notes in the class description above

StandardAnalyzer

public StandardAnalyzer(Set stopWords)

Deprecated. Use StandardAnalyzer(Version, Set) instead

Builds an analyzer with the given stop words.

StandardAnalyzer

public StandardAnalyzer(Version matchVersion,
                        Set stopWords)

Builds an analyzer with the given stop words.

Parameters:: matchVersion - Lucene version to match; see the Version notes in the class description above; stopWords - stop words

StandardAnalyzer

public StandardAnalyzer(String[] stopWords)

Deprecated. Use StandardAnalyzer(Version, Set) instead

Builds an analyzer with the given stop words.
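
Code that still holds its stop words in a String[] can move to the Set-based constructor by wrapping the array first. The helper below is a sketch using only standard collections; `StopWordArrayToSet` and `toStopSet` are hypothetical names. It lowercases the entries on the assumption that stop-word matching is case-sensitive and happens after LowerCaseFilter, so an upper-case entry would never match.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class StopWordArrayToSet {
    /** Copies the array into an unmodifiable, lowercased, de-duplicated set. */
    static Set<String> toStopSet(String[] words) {
        Set<String> set = new HashSet<String>();
        for (String word : words) {
            set.add(word.toLowerCase(Locale.ROOT));
        }
        return Collections.unmodifiableSet(set);
    }

    public static void main(String[] args) {
        Set<String> stopWords = toStopSet(new String[] {"Foo", "bar", "foo"});
        System.out.println(stopWords.size()); // 2
        // With Lucene on the classpath, the set would then be passed as:
        //   new StandardAnalyzer(matchVersion, stopWords)
    }
}
```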

StandardAnalyzer

public StandardAnalyzer(File stopwords)
                 throws IOException

Deprecated. Use StandardAnalyzer(Version, File) instead

Builds an analyzer with the stop words from the given file.

Throws:: IOException
See Also:: WordlistLoader.getWordSet(File)

StandardAnalyzer

public StandardAnalyzer(Version matchVersion,
                        File stopwords)
                 throws IOException

Builds an analyzer with the stop words from the given file.

Parameters:: matchVersion - Lucene version to match; see the Version notes in the class description above; stopwords - File to read stop words from
Throws:: IOException
See Also:: WordlistLoader.getWordSet(File)

StandardAnalyzer

public StandardAnalyzer(Reader stopwords)
                 throws IOException

Deprecated. Use StandardAnalyzer(Version, Reader) instead

Builds an analyzer with the stop words from the given reader.

Throws:: IOException
See Also:: WordlistLoader.getWordSet(Reader)

StandardAnalyzer

public StandardAnalyzer(Version matchVersion,
                        Reader stopwords)
                 throws IOException

Builds an analyzer with the stop words from the given reader.

Parameters:: matchVersion - Lucene version to match; see the Version notes in the class description above; stopwords - Reader to read stop words from
Throws:: IOException
See Also:: WordlistLoader.getWordSet(Reader)
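
The File and Reader constructors delegate stop-word loading to WordlistLoader. As a rough illustration of the expected input, the sketch below assumes a plain-text format with one word per line and surrounding whitespace trimmed; it is not WordlistLoader's code. `WordListSketch` and `readWordSet` are hypothetical names, and skipping blank lines is a choice made here that may differ from the library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;

final class WordListSketch {
    /** Reads one word per line, trimming whitespace and skipping blank lines. */
    static Set<String> readWordSet(Reader reader) throws IOException {
        Set<String> words = new HashSet<String>();
        BufferedReader buffered = new BufferedReader(reader);
        try {
            String line;
            while ((line = buffered.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        } finally {
            buffered.close(); // also closes the wrapped reader
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        Set<String> words = readWordSet(new StringReader("  foo \n\nbar\nfoo\n"));
        System.out.println(words.size()); // 2
    }
}
```

Like the constructors documented above, the sketch propagates IOException to the caller instead of swallowing it.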