StandardAnalyzer (Lucene 8.6.2 API)

java.lang.Object
- org.apache.lucene.analysis.Analyzer
  - org.apache.lucene.analysis.StopwordAnalyzerBase
    - org.apache.lucene.analysis.standard.StandardAnalyzer

All Implemented Interfaces:

Closeable, AutoCloseable
```
public final class StandardAnalyzer
extends StopwordAnalyzerBase
```
Filters StandardTokenizer with LowerCaseFilter and StopFilter, using a configurable list of stop words.

Since:

3.1
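
The filter chain described above can be observed by consuming the analyzer's `TokenStream`. The following is a minimal sketch, not part of the official documentation: the class name `StandardAnalyzerDemo`, the helper `tokens`, and the field name `"body"` are illustrative assumptions, and it presumes Lucene 8.6.2 core is on the classpath.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class StandardAnalyzerDemo {

    // Hypothetical helper: collects the terms an analyzer emits for a piece of text.
    static List<String> tokens(Analyzer analyzer, String text) throws IOException {
        List<String> out = new ArrayList<>();
        // "body" is an arbitrary field name chosen for this sketch.
        try (TokenStream ts = analyzer.tokenStream("body", text)) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();                      // required before the first incrementToken()
            while (ts.incrementToken()) {
                out.add(term.toString());
            }
            ts.end();                        // signals end of stream before close()
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // The no-argument constructor uses no stop words, so only
        // tokenization and lowercasing apply here.
        try (StandardAnalyzer analyzer = new StandardAnalyzer()) {
            System.out.println(tokens(analyzer, "The Quick Brown Fox"));
            // prints [the, quick, brown, fox]
        }
    }
}
```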

Nested Class Summary
- Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
  Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents

Field Summary

Fields
Modifier and Type	Field and Description
`static int`	`DEFAULT_MAX_TOKEN_LENGTH` Default maximum allowed token length
- Fields inherited from class org.apache.lucene.analysis.StopwordAnalyzerBase
  stopwords
- Fields inherited from class org.apache.lucene.analysis.Analyzer
  GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY

Constructor Summary

Constructors
Constructor and Description
`StandardAnalyzer()` Builds an analyzer with no stop words.
`StandardAnalyzer(CharArraySet stopWords)` Builds an analyzer with the given stop words.
`StandardAnalyzer(Reader stopwords)` Builds an analyzer with the stop words from the given reader.

Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`protected Analyzer.TokenStreamComponents`	`createComponents(String fieldName)` Creates a new `Analyzer.TokenStreamComponents` instance for this analyzer.
`int`	`getMaxTokenLength()` Returns the current maximum token length
`protected TokenStream`	`normalize(String fieldName, TokenStream in)` Wrap the given `TokenStream` in order to apply normalization filters.
`void`	`setMaxTokenLength(int length)` Set the max allowed token length.

Methods inherited from class org.apache.lucene.analysis.StopwordAnalyzerBase
getStopwordSet, loadStopwordSet, loadStopwordSet, loadStopwordSet

Methods inherited from class org.apache.lucene.analysis.Analyzer
attributeFactory, close, getOffsetGap, getPositionIncrementGap, getReuseStrategy, getVersion, initReader, initReaderForNormalization, normalize, setVersion, tokenStream, tokenStream

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - DEFAULT_MAX_TOKEN_LENGTH
```
public static final int DEFAULT_MAX_TOKEN_LENGTH
```
    Default maximum allowed token length
    
    See Also:
    
    Constant Field Values
- Constructor Detail
  - StandardAnalyzer
```
public StandardAnalyzer(CharArraySet stopWords)
```
    Builds an analyzer with the given stop words.
    
    Parameters:
    
    stopWords - stop words
  - StandardAnalyzer
```
public StandardAnalyzer()
```
    Builds an analyzer with no stop words.
  - StandardAnalyzer
```
public StandardAnalyzer(Reader stopwords)
                 throws IOException
```
    Builds an analyzer with the stop words from the given reader.
    
    Parameters:
    
    stopwords - Reader to read stop words from
    
    Throws:
    
    IOException
    
    See Also:
    
    WordlistLoader.getWordSet(Reader)
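
    The two stop-word constructors can be compared side by side. This is a hedged sketch rather than canonical usage: the class `StopWordConstructorsDemo`, its helper methods, the sample stop words, and the field name `"body"` are assumptions made for illustration. The reader variant assumes one stop word per line, which is the format `WordlistLoader.getWordSet(Reader)` reads.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.CharArraySet;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class StopWordConstructorsDemo {

    // Hypothetical helper: collects the terms an analyzer emits for a piece of text.
    static List<String> tokens(Analyzer analyzer, String text) throws IOException {
        List<String> out = new ArrayList<>();
        try (TokenStream ts = analyzer.tokenStream("body", text)) {
            CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
            ts.reset();
            while (ts.incrementToken()) {
                out.add(term.toString());
            }
            ts.end();
        }
        return out;
    }

    // Stop words supplied as an in-memory set (second argument: ignore case).
    static List<String> withSet(String text) throws IOException {
        CharArraySet stopWords = new CharArraySet(Arrays.asList("the", "a"), true);
        try (StandardAnalyzer analyzer = new StandardAnalyzer(stopWords)) {
            return tokens(analyzer, text);
        }
    }

    // Stop words read from a Reader, one word per line.
    static List<String> withReader(String text) throws IOException {
        try (StandardAnalyzer analyzer = new StandardAnalyzer(new StringReader("the\na\n"))) {
            return tokens(analyzer, text);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(withSet("The cat chased a mouse"));    // prints [cat, chased, mouse]
        System.out.println(withReader("The cat chased a mouse")); // prints [cat, chased, mouse]
    }
}
```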
- Method Detail
  - setMaxTokenLength
```
public void setMaxTokenLength(int length)
```
    Set the max allowed token length. Tokens longer than this are split at this length and emitted as multiple tokens. To skip such long tokens instead, increase the max length and then use LengthFilter to remove them. The default is DEFAULT_MAX_TOKEN_LENGTH.
  - getMaxTokenLength
```
public int getMaxTokenLength()
```
    Returns the current maximum token length
    
    See Also:
    
    setMaxTokenLength(int)
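
    The splitting behavior described for `setMaxTokenLength` can be checked with a short experiment. This is a sketch under stated assumptions: the class `MaxTokenLengthDemo`, the helper `tokensWithMaxLength`, the limit of 5, and the field name `"body"` are all illustrative and do not come from the Lucene documentation.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class MaxTokenLengthDemo {

    // Hypothetical helper: analyzes text with a custom maximum token length.
    static List<String> tokensWithMaxLength(int maxLength, String text) throws IOException {
        List<String> out = new ArrayList<>();
        try (StandardAnalyzer analyzer = new StandardAnalyzer()) {
            analyzer.setMaxTokenLength(maxLength);
            // The getter reflects the value that was just set.
            if (analyzer.getMaxTokenLength() != maxLength) {
                throw new IllegalStateException("unexpected max token length");
            }
            try (TokenStream ts = analyzer.tokenStream("body", text)) {
                CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
                ts.reset();
                while (ts.incrementToken()) {
                    out.add(term.toString());
                }
                ts.end();
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // A 12-character word with a limit of 5 should come back as several
        // shorter tokens, none longer than 5 characters.
        for (String token : tokensWithMaxLength(5, "abcdefghijkl")) {
            System.out.println(token + " (" + token.length() + ")");
        }
    }
}
```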
  - createComponents
```
protected Analyzer.TokenStreamComponents createComponents(String fieldName)
```
    Description copied from class: Analyzer
    
    Creates a new Analyzer.TokenStreamComponents instance for this analyzer.
    
    Specified by:
    
    createComponents in class Analyzer
    
    Parameters:
    
    fieldName - the name of the field whose content is passed to the Analyzer.TokenStreamComponents sink as a reader
    
    Returns:
    
    the Analyzer.TokenStreamComponents for this analyzer.
  - normalize
```
protected TokenStream normalize(String fieldName,
                                TokenStream in)
```
    Description copied from class: Analyzer
    
    Wraps the given TokenStream in order to apply normalization filters. The default implementation in Analyzer returns the TokenStream as-is. This method is used by Analyzer.normalize(String, String).
    
    Overrides:
    
    normalize in class Analyzer
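
    Because `normalize(String, TokenStream)` is protected, callers reach it through the public `Analyzer.normalize(String, String)`. The sketch below is illustrative only: the class `NormalizeDemo`, the helper `normalized`, and the field name `"body"` are assumptions. It also assumes that this override applies lowercasing, in line with the `LowerCaseFilter` named in the class description.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.util.BytesRef;

public class NormalizeDemo {

    // Hypothetical helper: runs a single term through the analyzer's
    // normalization chain and returns the result as a String.
    static String normalized(String text) {
        try (StandardAnalyzer analyzer = new StandardAnalyzer()) {
            // Analyzer.normalize(String, String) returns the normalized term as bytes.
            BytesRef bytes = analyzer.normalize("body", text);
            return bytes.utf8ToString();
        }
    }

    public static void main(String[] args) {
        // Assuming the override lowercases its input, this shows the
        // lowercased form of the term.
        System.out.println(normalized("WiFi"));
    }
}
```

    Normalization of this kind is typically applied to query terms that bypass full analysis, such as wildcard or prefix terms, so that they still match the lowercased terms in the index.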
