Class AnalyzerWrapper

java.lang.Object
  org.apache.lucene.analysis.Analyzer
    org.apache.lucene.analysis.AnalyzerWrapper

All Implemented Interfaces:
  Closeable, AutoCloseable

Direct Known Subclasses:
  DelegatingAnalyzerWrapper

public abstract class AnalyzerWrapper extends Analyzer
Extension to Analyzer suitable for Analyzers which wrap other Analyzers.

getWrappedAnalyzer(String) allows the Analyzer to wrap multiple Analyzers which are selected on a per-field basis.

wrapComponents(String, Analyzer.TokenStreamComponents) allows the TokenStreamComponents of the wrapped Analyzer to then be wrapped (such as adding a new TokenFilter) to form new TokenStreamComponents.

wrapReader(String, Reader) allows the Reader of the wrapped Analyzer to then be wrapped (such as adding a new CharFilter).

Important: If you do not want to wrap the TokenStream using wrapComponents(String, Analyzer.TokenStreamComponents) or the Reader using wrapReader(String, Reader), and just want to delegate to other analyzers (e.g. by field name), use DelegatingAnalyzerWrapper as the superclass!

Since:
  4.0.0
See Also:
  DelegatingAnalyzerWrapper
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
-
-
Field Summary
-
Fields inherited from class org.apache.lucene.analysis.Analyzer
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
-
-
Constructor Summary

protected AnalyzerWrapper(Analyzer.ReuseStrategy reuseStrategy)
  Creates a new AnalyzerWrapper with the given reuse strategy.
-
Method Summary

protected AttributeFactory attributeFactory(String fieldName)
protected Analyzer.TokenStreamComponents createComponents(String fieldName)
  Creates a new Analyzer.TokenStreamComponents instance for this analyzer.
int getOffsetGap(String fieldName)
  Just like Analyzer.getPositionIncrementGap(java.lang.String), except for Token offsets instead.
int getPositionIncrementGap(String fieldName)
  Invoked before indexing an IndexableField instance if terms have already been added to that field.
protected abstract Analyzer getWrappedAnalyzer(String fieldName)
  Retrieves the wrapped Analyzer appropriate for analyzing the field with the given name.
Reader initReader(String fieldName, Reader reader)
  Override this if you want to add a CharFilter chain.
protected Reader initReaderForNormalization(String fieldName, Reader reader)
  Wrap the given Reader with CharFilters that make sense for normalization.
protected TokenStream normalize(String fieldName, TokenStream in)
  Wrap the given TokenStream in order to apply normalization filters.
protected Analyzer.TokenStreamComponents wrapComponents(String fieldName, Analyzer.TokenStreamComponents components)
  Wraps / alters the given TokenStreamComponents, taken from the wrapped Analyzer, to form new components.
protected Reader wrapReader(String fieldName, Reader reader)
  Wraps / alters the given Reader.
protected Reader wrapReaderForNormalization(String fieldName, Reader reader)
  Wraps / alters the given Reader.
protected TokenStream wrapTokenStreamForNormalization(String fieldName, TokenStream in)
  Wraps / alters the given TokenStream for normalization purposes.
-
Methods inherited from class org.apache.lucene.analysis.Analyzer
close, getReuseStrategy, normalize, tokenStream, tokenStream
-
Constructor Detail
-
AnalyzerWrapper
protected AnalyzerWrapper(Analyzer.ReuseStrategy reuseStrategy)
Creates a new AnalyzerWrapper with the given reuse strategy.

If you want to wrap a single delegate Analyzer, you can probably reuse its strategy when instantiating this subclass: super(delegate.getReuseStrategy());

If you choose different analyzers per field, use Analyzer.PER_FIELD_REUSE_STRATEGY.

See Also:
Analyzer.getReuseStrategy()
-
-
Method Detail
-
getWrappedAnalyzer
protected abstract Analyzer getWrappedAnalyzer(String fieldName)
Retrieves the wrapped Analyzer appropriate for analyzing the field with the given name.

Parameters:
  fieldName - name of the field which is to be analyzed
Returns:
  Analyzer for the field with the given name. Assumed to be non-null.
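As a sketch of how getWrappedAnalyzer(String) and the constructor's reuse strategy fit together, the following hypothetical wrapper (the class and field names are illustrative, not part of Lucene; it assumes lucene-core and lucene-analysis-common on the classpath) routes an "id" field to a KeywordAnalyzer and everything else to a StandardAnalyzer. Because a pure per-field delegation like this wraps neither components nor readers, DelegatingAnalyzerWrapper would normally be the better superclass, as noted above.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.AnalyzerWrapper;
import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

// Hypothetical example wrapper; not part of Lucene.
public final class PerFieldDelegatingAnalyzer extends AnalyzerWrapper {
  private final Analyzer idAnalyzer = new KeywordAnalyzer();
  private final Analyzer defaultAnalyzer = new StandardAnalyzer();

  public PerFieldDelegatingAnalyzer() {
    // Different delegates per field, so the per-field reuse strategy applies.
    super(Analyzer.PER_FIELD_REUSE_STRATEGY);
  }

  @Override
  protected Analyzer getWrappedAnalyzer(String fieldName) {
    // Must never return null.
    return "id".equals(fieldName) ? idAnalyzer : defaultAnalyzer;
  }
}
```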
-
wrapComponents
protected Analyzer.TokenStreamComponents wrapComponents(String fieldName, Analyzer.TokenStreamComponents components)
Wraps / alters the given TokenStreamComponents, taken from the wrapped Analyzer, to form new components. It is through this method that new TokenFilters can be added by AnalyzerWrappers. By default, the given components are returned.

Parameters:
  fieldName - name of the field which is to be analyzed
  components - TokenStreamComponents taken from the wrapped Analyzer
Returns:
  Wrapped / altered TokenStreamComponents.
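A minimal sketch of overriding wrapComponents to append a TokenFilter, assuming Lucene 8.x or later (where TokenStreamComponents exposes getSource(); earlier versions use getTokenizer() instead) and lucene-analysis-common on the classpath. The class name is illustrative.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.AnalyzerWrapper;
import org.apache.lucene.analysis.LowerCaseFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;

// Hypothetical example; wraps a WhitespaceAnalyzer and appends a LowerCaseFilter.
public final class LowercasingWrapper extends AnalyzerWrapper {
  private final Analyzer delegate = new WhitespaceAnalyzer();

  public LowercasingWrapper() {
    // A single delegate for all fields, so the global reuse strategy suffices.
    super(Analyzer.GLOBAL_REUSE_STRATEGY);
  }

  @Override
  protected Analyzer getWrappedAnalyzer(String fieldName) {
    return delegate;
  }

  @Override
  protected Analyzer.TokenStreamComponents wrapComponents(
      String fieldName, Analyzer.TokenStreamComponents components) {
    // Append a new TokenFilter to the wrapped analyzer's chain.
    TokenStream filtered = new LowerCaseFilter(components.getTokenStream());
    return new Analyzer.TokenStreamComponents(components.getSource(), filtered);
  }
}
```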
-
wrapTokenStreamForNormalization
protected TokenStream wrapTokenStreamForNormalization(String fieldName, TokenStream in)
Wraps / alters the given TokenStream for normalization purposes, taken from the wrapped Analyzer, to form new components. It is through this method that new TokenFilters can be added by AnalyzerWrappers. By default, the given token stream is returned.

Parameters:
  fieldName - name of the field which is to be analyzed
  in - TokenStream taken from the wrapped Analyzer
Returns:
  Wrapped / altered TokenStream.
-
wrapReader
protected Reader wrapReader(String fieldName, Reader reader)
Wraps / alters the given Reader. Through this method AnalyzerWrappers can implement initReader(String, Reader). By default, the given reader is returned.

Parameters:
  fieldName - name of the field which is to be analyzed
  reader - the reader to wrap
Returns:
  the wrapped reader
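A minimal sketch of overriding wrapReader to prepend a CharFilter, assuming lucene-analysis-common (which provides HTMLStripCharFilter and WhitespaceAnalyzer) on the classpath. The class name is illustrative.

```java
import java.io.Reader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.AnalyzerWrapper;
import org.apache.lucene.analysis.charfilter.HTMLStripCharFilter;
import org.apache.lucene.analysis.core.WhitespaceAnalyzer;

// Hypothetical example; strips HTML markup before the wrapped analyzer runs.
public final class HtmlStrippingWrapper extends AnalyzerWrapper {
  private final Analyzer delegate = new WhitespaceAnalyzer();

  public HtmlStrippingWrapper() {
    super(Analyzer.GLOBAL_REUSE_STRATEGY);
  }

  @Override
  protected Analyzer getWrappedAnalyzer(String fieldName) {
    return delegate;
  }

  @Override
  protected Reader wrapReader(String fieldName, Reader reader) {
    // Decorate the Reader with a CharFilter; this backs initReader(String, Reader).
    return new HTMLStripCharFilter(reader);
  }
}
```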
-
wrapReaderForNormalization
protected Reader wrapReaderForNormalization(String fieldName, Reader reader)
Wraps / alters the given Reader. Through this method AnalyzerWrappers can implement initReaderForNormalization(String, Reader). By default, the given reader is returned.

Parameters:
  fieldName - name of the field which is to be analyzed
  reader - the reader to wrap
Returns:
  the wrapped reader
-
createComponents
protected final Analyzer.TokenStreamComponents createComponents(String fieldName)
Description copied from class: Analyzer

Creates a new Analyzer.TokenStreamComponents instance for this analyzer.

Specified by:
  createComponents in class Analyzer
Parameters:
  fieldName - the name of the fields content passed to the Analyzer.TokenStreamComponents sink as a reader
Returns:
  the Analyzer.TokenStreamComponents for this analyzer.
-
normalize
protected final TokenStream normalize(String fieldName, TokenStream in)
Description copied from class: Analyzer

Wrap the given TokenStream in order to apply normalization filters. The default implementation returns the TokenStream as-is. This is used by Analyzer.normalize(String, String).
-
getPositionIncrementGap
public int getPositionIncrementGap(String fieldName)
Description copied from class: Analyzer

Invoked before indexing an IndexableField instance if terms have already been added to that field. This allows custom analyzers to place an automatic position increment gap between IndexableField instances using the same field name. The default position increment gap is 0. With a 0 position increment gap and the typical default token position increment of 1, all terms in a field, including across IndexableField instances, are in successive positions, allowing exact PhraseQuery matches, for instance, across IndexableField instance boundaries.

Overrides:
  getPositionIncrementGap in class Analyzer
Parameters:
  fieldName - IndexableField name being indexed
Returns:
  position increment gap, added to the next token emitted from Analyzer.tokenStream(String, Reader). This value must be >= 0.
-
getOffsetGap
public int getOffsetGap(String fieldName)
Description copied from class: Analyzer

Just like Analyzer.getPositionIncrementGap(java.lang.String), except for Token offsets instead. By default this returns 1. This method is only called if the field produced at least one token for indexing.

Overrides:
  getOffsetGap in class Analyzer
Parameters:
  fieldName - the field just indexed
Returns:
  offset gap, added to the next token emitted from Analyzer.tokenStream(String, Reader). This value must be >= 0.
-
initReader
public final Reader initReader(String fieldName, Reader reader)
Description copied from class: Analyzer

Override this if you want to add a CharFilter chain. The default implementation returns reader unchanged.

Overrides:
  initReader in class Analyzer
Parameters:
  fieldName - IndexableField name being indexed
  reader - original Reader
Returns:
  reader, optionally decorated with CharFilter(s)
-
initReaderForNormalization
protected final Reader initReaderForNormalization(String fieldName, Reader reader)
Description copied from class: Analyzer

Wrap the given Reader with CharFilters that make sense for normalization. This is typically a subset of the CharFilters that are applied in Analyzer.initReader(String, Reader). This is used by Analyzer.normalize(String, String).

Overrides:
  initReaderForNormalization in class Analyzer
-
attributeFactory
protected final AttributeFactory attributeFactory(String fieldName)
Description copied from class: Analyzer

Return the AttributeFactory to be used for analysis and normalization on the given field name. The default implementation returns TokenStream.DEFAULT_TOKEN_ATTRIBUTE_FACTORY.

Overrides:
  attributeFactory in class Analyzer
-