Class AnalyzerWrapper

java.lang.Object
org.apache.lucene.analysis.Analyzer
org.apache.lucene.analysis.AnalyzerWrapper
All Implemented Interfaces:
Closeable, AutoCloseable
Direct Known Subclasses:
DelegatingAnalyzerWrapper

public abstract class AnalyzerWrapper extends Analyzer
Extension to Analyzer suitable for Analyzers which wrap other Analyzers.

getWrappedAnalyzer(String) allows the Analyzer to wrap multiple Analyzers which are selected on a per field basis.

wrapComponents(String, Analyzer.TokenStreamComponents) allows the TokenStreamComponents of the wrapped Analyzer to then be wrapped (such as adding a new TokenFilter to form new TokenStreamComponents).

wrapReader(String, Reader) allows the Reader of the wrapped Analyzer to then be wrapped (such as adding a new CharFilter).

Important: If you do not want to wrap the TokenStream using wrapComponents(String, Analyzer.TokenStreamComponents) or the Reader using wrapReader(String, Reader) and just delegate to other analyzers (like by field name), use DelegatingAnalyzerWrapper as superclass!
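As a hedged sketch of the delegate-only case the note above describes, the hypothetical wrapper below selects an Analyzer per field and does no component or reader wrapping, so it extends DelegatingAnalyzerWrapper. The field-to-analyzer map and the class name are assumptions for illustration:

```java
import java.util.Map;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.DelegatingAnalyzerWrapper;

// Hypothetical per-field wrapper that only delegates; since it does not
// wrap components or readers, DelegatingAnalyzerWrapper is the right base.
public class PerFieldDelegatingAnalyzer extends DelegatingAnalyzerWrapper {
  private final Map<String, Analyzer> fieldAnalyzers; // assumed mapping
  private final Analyzer defaultAnalyzer;

  public PerFieldDelegatingAnalyzer(Analyzer defaultAnalyzer,
                                    Map<String, Analyzer> fieldAnalyzers) {
    super(PER_FIELD_REUSE_STRATEGY); // cache one component set per field
    this.defaultAnalyzer = defaultAnalyzer;
    this.fieldAnalyzers = fieldAnalyzers;
  }

  @Override
  protected Analyzer getWrappedAnalyzer(String fieldName) {
    // Must never return null (see getWrappedAnalyzer's contract below).
    return fieldAnalyzers.getOrDefault(fieldName, defaultAnalyzer);
  }
}
```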

Since:
4.0.0
  • Constructor Details

    • AnalyzerWrapper

      protected AnalyzerWrapper(Analyzer.ReuseStrategy reuseStrategy)
      Creates a new AnalyzerWrapper with the given reuse strategy.

      If you want to wrap a single delegate Analyzer you can probably reuse its strategy when instantiating this subclass: super(delegate.getReuseStrategy());.

      If you choose different analyzers per field, use Analyzer.PER_FIELD_REUSE_STRATEGY.
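A minimal sketch of the single-delegate case described above, reusing the delegate's own strategy; the class name is an assumption for illustration:

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.AnalyzerWrapper;

// Hypothetical wrapper around a single delegate Analyzer.
public class SingleDelegateWrapper extends AnalyzerWrapper {
  private final Analyzer delegate;

  public SingleDelegateWrapper(Analyzer delegate) {
    // Reuse the delegate's strategy, as the constructor docs suggest.
    super(delegate.getReuseStrategy());
    this.delegate = delegate;
  }

  @Override
  protected Analyzer getWrappedAnalyzer(String fieldName) {
    return delegate; // same delegate regardless of field
  }
}
```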

  • Method Details

    • getWrappedAnalyzer

      protected abstract Analyzer getWrappedAnalyzer(String fieldName)
Retrieves the wrapped Analyzer appropriate for analyzing the field with the given name.
      Parameters:
      fieldName - Name of the field which is to be analyzed
      Returns:
      Analyzer for the field with the given name. Assumed to be non-null
    • wrapComponents

      protected Analyzer.TokenStreamComponents wrapComponents(String fieldName, Analyzer.TokenStreamComponents components)
      Wraps / alters the given TokenStreamComponents, taken from the wrapped Analyzer, to form new components. It is through this method that new TokenFilters can be added by AnalyzerWrappers. By default, the given components are returned.
      Parameters:
      fieldName - Name of the field which is to be analyzed
      components - TokenStreamComponents taken from the wrapped Analyzer
      Returns:
      Wrapped / altered TokenStreamComponents.
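A hedged sketch of a wrapComponents override that appends a TokenFilter to the wrapped analyzer's chain. It assumes a Lucene version (8.x+) in which TokenStreamComponents exposes getSource() and accepts a (source, result) constructor; in older versions getTokenizer() is used instead:

```java
// Inside an AnalyzerWrapper subclass: lower-case the wrapped chain's output.
@Override
protected Analyzer.TokenStreamComponents wrapComponents(
    String fieldName, Analyzer.TokenStreamComponents components) {
  // Add a new TokenFilter on top of the wrapped analyzer's token stream.
  TokenStream filtered = new LowerCaseFilter(components.getTokenStream());
  // Keep the original source, substitute the extended filter chain.
  return new Analyzer.TokenStreamComponents(components.getSource(), filtered);
}
```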
    • wrapTokenStreamForNormalization

      protected TokenStream wrapTokenStreamForNormalization(String fieldName, TokenStream in)
Wraps / alters the given TokenStream for normalization purposes, taken from the wrapped Analyzer, to form a new token stream. It is through this method that new TokenFilters can be added by AnalyzerWrappers. By default, the given token stream is returned.
      Parameters:
      fieldName - Name of the field which is to be analyzed
      in - TokenStream taken from the wrapped Analyzer
      Returns:
Wrapped / altered TokenStream.
    • wrapReader

      protected Reader wrapReader(String fieldName, Reader reader)
      Wraps / alters the given Reader. Through this method AnalyzerWrappers can implement initReader(String, Reader). By default, the given reader is returned.
      Parameters:
      fieldName - name of the field which is to be analyzed
      reader - the reader to wrap
      Returns:
      the wrapped reader
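As an illustrative sketch, a wrapReader override can stack a CharFilter in front of tokenization. HTMLStripCharFilter (from the lucene-analysis-common module) is one such filter; using it here is an assumption for the example, not the only choice:

```java
// Inside an AnalyzerWrapper subclass: strip HTML markup before the
// wrapped analyzer's tokenizer sees the text.
@Override
protected Reader wrapReader(String fieldName, Reader reader) {
  return new HTMLStripCharFilter(reader);
}
```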
    • wrapReaderForNormalization

      protected Reader wrapReaderForNormalization(String fieldName, Reader reader)
      Wraps / alters the given Reader. Through this method AnalyzerWrappers can implement initReaderForNormalization(String, Reader). By default, the given reader is returned.
      Parameters:
      fieldName - name of the field which is to be analyzed
      reader - the reader to wrap
      Returns:
      the wrapped reader
    • createComponents

      protected final Analyzer.TokenStreamComponents createComponents(String fieldName)
      Description copied from class: Analyzer
      Creates a new Analyzer.TokenStreamComponents instance for this analyzer.
      Specified by:
      createComponents in class Analyzer
      Parameters:
fieldName - the name of the field whose content is passed to the Analyzer.TokenStreamComponents sink as a reader
      Returns:
      the Analyzer.TokenStreamComponents for this analyzer.
    • normalize

      protected final TokenStream normalize(String fieldName, TokenStream in)
      Description copied from class: Analyzer
      Wrap the given TokenStream in order to apply normalization filters. The default implementation returns the TokenStream as-is. This is used by Analyzer.normalize(String, String).
      Overrides:
      normalize in class Analyzer
    • getPositionIncrementGap

      public int getPositionIncrementGap(String fieldName)
      Description copied from class: Analyzer
Invoked before indexing an IndexableField instance if terms have already been added to that field. This allows custom analyzers to place an automatic position increment gap between IndexableField instances using the same field name. The default position increment gap is 0. With a 0 position increment gap and the typical default token position increment of 1, all terms in a field, including across IndexableField instances, are in successive positions, allowing exact PhraseQuery matches, for instance, across IndexableField instance boundaries.
      Overrides:
      getPositionIncrementGap in class Analyzer
      Parameters:
      fieldName - IndexableField name being indexed.
      Returns:
      position increment gap, added to the next token emitted from Analyzer.tokenStream(String,Reader). This value must be >= 0.
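A sketch of an override using a hypothetical "body" field name: a large gap keeps phrase queries from matching across the boundary between multiple IndexableField instances of the same field. The field name and gap value are assumptions for illustration:

```java
// Inside an AnalyzerWrapper subclass: widen the gap for multi-valued
// "body" fields; delegate everything else to the default behavior.
@Override
public int getPositionIncrementGap(String fieldName) {
  return "body".equals(fieldName)
      ? 100
      : super.getPositionIncrementGap(fieldName);
}
```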
    • getOffsetGap

      public int getOffsetGap(String fieldName)
      Description copied from class: Analyzer
      Just like Analyzer.getPositionIncrementGap(java.lang.String), except for Token offsets instead. By default this returns 1. This method is only called if the field produced at least one token for indexing.
      Overrides:
      getOffsetGap in class Analyzer
      Parameters:
      fieldName - the field just indexed
      Returns:
      offset gap, added to the next token emitted from Analyzer.tokenStream(String,Reader). This value must be >= 0.
    • initReader

      public final Reader initReader(String fieldName, Reader reader)
      Description copied from class: Analyzer
      Override this if you want to add a CharFilter chain.

      The default implementation returns reader unchanged.

      Overrides:
      initReader in class Analyzer
      Parameters:
      fieldName - IndexableField name being indexed
      reader - original Reader
      Returns:
      reader, optionally decorated with CharFilter(s)
    • initReaderForNormalization

      protected final Reader initReaderForNormalization(String fieldName, Reader reader)
      Description copied from class: Analyzer
      Wrap the given Reader with CharFilters that make sense for normalization. This is typically a subset of the CharFilters that are applied in Analyzer.initReader(String, Reader). This is used by Analyzer.normalize(String, String).
      Overrides:
      initReaderForNormalization in class Analyzer
    • attributeFactory

      protected final AttributeFactory attributeFactory(String fieldName)
      Description copied from class: Analyzer
Return the AttributeFactory to be used for analysis and normalization on the given field name. The default implementation returns TokenStream.DEFAULT_TOKEN_ATTRIBUTE_FACTORY.
      Overrides:
      attributeFactory in class Analyzer