public abstract class AnalyzerWrapper extends Analyzer

Extension to Analyzer suitable for Analyzers which wrap other Analyzers.

getWrappedAnalyzer(String) allows the Analyzer to wrap multiple Analyzers which are selected on a per-field basis.

wrapComponents(String, Analyzer.TokenStreamComponents) allows the TokenStreamComponents of the wrapped Analyzer to then be wrapped (such as adding a new TokenFilter) to form new TokenStreamComponents.

wrapReader(String, Reader) allows the Reader of the wrapped Analyzer to then be wrapped (such as adding a new CharFilter).
Important: If you do not want to wrap the TokenStream using wrapComponents(String, Analyzer.TokenStreamComponents) or the Reader using wrapReader(String, Reader), and just delegate to other analyzers (like by field name), use DelegatingAnalyzerWrapper as superclass!

See Also:
DelegatingAnalyzerWrapper
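To make the wrapping hooks concrete, here is a minimal sketch of a subclass that wraps a single delegate and adds a TokenFilter via wrapComponents(String, Analyzer.TokenStreamComponents). The class name `LowercaseWrapper` is hypothetical; the sketch assumes the Lucene 8.x API, where Analyzer.TokenStreamComponents exposes getSource() (older releases expose getTokenizer() instead).

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.AnalyzerWrapper;
import org.apache.lucene.analysis.LowerCaseFilter;

// Hypothetical wrapper that lower-cases the output of a delegate Analyzer.
public final class LowercaseWrapper extends AnalyzerWrapper {
  private final Analyzer delegate;

  public LowercaseWrapper(Analyzer delegate) {
    super(delegate.getReuseStrategy()); // single delegate: reuse its strategy
    this.delegate = delegate;
  }

  @Override
  protected Analyzer getWrappedAnalyzer(String fieldName) {
    return delegate; // same analyzer for every field
  }

  @Override
  protected TokenStreamComponents wrapComponents(String fieldName,
                                                 TokenStreamComponents components) {
    // Add a TokenFilter on top of the wrapped analyzer's chain.
    return new TokenStreamComponents(components.getSource(),
        new LowerCaseFilter(components.getTokenStream()));
  }
}
```

Because only wrapComponents is overridden, the delegate's Reader handling and normalization chain pass through unchanged.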
Nested classes/interfaces inherited from class Analyzer:
Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents

Fields inherited from class Analyzer:
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
Modifier | Constructor and Description
---|---
protected | AnalyzerWrapper(Analyzer.ReuseStrategy reuseStrategy) Creates a new AnalyzerWrapper with the given reuse strategy.
Modifier and Type | Method and Description
---|---
protected AttributeFactory | attributeFactory(String fieldName)
protected Analyzer.TokenStreamComponents | createComponents(String fieldName) Creates a new Analyzer.TokenStreamComponents instance for this analyzer.
int | getOffsetGap(String fieldName) Just like Analyzer.getPositionIncrementGap(java.lang.String), except for Token offsets instead.
int | getPositionIncrementGap(String fieldName) Invoked before indexing an IndexableField instance if terms have already been added to that field.
protected abstract Analyzer | getWrappedAnalyzer(String fieldName) Retrieves the wrapped Analyzer appropriate for analyzing the field with the given name.
Reader | initReader(String fieldName, Reader reader) Override this if you want to add a CharFilter chain.
protected Reader | initReaderForNormalization(String fieldName, Reader reader) Wrap the given Reader with CharFilters that make sense for normalization.
protected TokenStream | normalize(String fieldName, TokenStream in) Wrap the given TokenStream in order to apply normalization filters.
protected Analyzer.TokenStreamComponents | wrapComponents(String fieldName, Analyzer.TokenStreamComponents components) Wraps / alters the given TokenStreamComponents, taken from the wrapped Analyzer, to form new components.
protected Reader | wrapReader(String fieldName, Reader reader) Wraps / alters the given Reader.
protected Reader | wrapReaderForNormalization(String fieldName, Reader reader) Wraps / alters the given Reader for normalization purposes.
protected TokenStream | wrapTokenStreamForNormalization(String fieldName, TokenStream in) Wraps / alters the given TokenStream for normalization purposes, taken from the wrapped Analyzer, to form new components.
Methods inherited from class Analyzer:
close, getReuseStrategy, getVersion, normalize, setVersion, tokenStream, tokenStream
protected AnalyzerWrapper(Analyzer.ReuseStrategy reuseStrategy)

Creates a new AnalyzerWrapper with the given reuse strategy. If you want to wrap a single delegate Analyzer, you can probably reuse its strategy when instantiating this subclass: super(delegate.getReuseStrategy()).

If you choose different analyzers per field, use Analyzer.PER_FIELD_REUSE_STRATEGY.

See Also:
Analyzer.getReuseStrategy()
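The per-field case can be sketched as follows. `PerFieldWrapper` and its field names are hypothetical; note that for pure delegation like this the class documentation recommends DelegatingAnalyzerWrapper as superclass, so this sketch only illustrates the reuse-strategy choice in the constructor.

```java
import java.util.Map;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.AnalyzerWrapper;

// Hypothetical wrapper that selects a different delegate Analyzer per field.
public final class PerFieldWrapper extends AnalyzerWrapper {
  private final Map<String, Analyzer> byField;
  private final Analyzer defaultAnalyzer;

  public PerFieldWrapper(Map<String, Analyzer> byField, Analyzer defaultAnalyzer) {
    // Components differ per field, so the per-field reuse strategy is required.
    super(Analyzer.PER_FIELD_REUSE_STRATEGY);
    this.byField = byField;
    this.defaultAnalyzer = defaultAnalyzer;
  }

  @Override
  protected Analyzer getWrappedAnalyzer(String fieldName) {
    return byField.getOrDefault(fieldName, defaultAnalyzer);
  }
}
```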
protected abstract Analyzer getWrappedAnalyzer(String fieldName)

Retrieves the wrapped Analyzer appropriate for analyzing the field with the given name.

Parameters:
fieldName - Name of the field which is to be analyzed

protected Analyzer.TokenStreamComponents wrapComponents(String fieldName, Analyzer.TokenStreamComponents components)

Wraps / alters the given TokenStreamComponents, taken from the wrapped Analyzer, to form new components. By default, the given components are returned.

Parameters:
fieldName - Name of the field which is to be analyzed
components - TokenStreamComponents taken from the wrapped Analyzer

protected TokenStream wrapTokenStreamForNormalization(String fieldName, TokenStream in)

Wraps / alters the given TokenStream for normalization purposes, taken from the wrapped Analyzer, to form new components. By default, the given token stream is returned.

Parameters:
fieldName - Name of the field which is to be analyzed
in - TokenStream taken from the wrapped Analyzer

protected Reader wrapReader(String fieldName, Reader reader)

Wraps / alters the given Reader. Through this method AnalyzerWrappers can implement initReader(String, Reader). By default, the given reader is returned.

Parameters:
fieldName - name of the field which is to be analyzed
reader - the reader to wrap

protected Reader wrapReaderForNormalization(String fieldName, Reader reader)

Wraps / alters the given Reader. Through this method AnalyzerWrappers can implement initReaderForNormalization(String, Reader). By default, the given reader is returned.

Parameters:
fieldName - name of the field which is to be analyzed
reader - the reader to wrap

protected final Analyzer.TokenStreamComponents createComponents(String fieldName)

Description copied from class: Analyzer
Creates a new Analyzer.TokenStreamComponents instance for this analyzer.

Specified by:
createComponents in class Analyzer
Parameters:
fieldName - the name of the fields content passed to the Analyzer.TokenStreamComponents sink as a reader
Returns:
the Analyzer.TokenStreamComponents for this analyzer

protected final TokenStream normalize(String fieldName, TokenStream in)

Description copied from class: Analyzer
Wrap the given TokenStream in order to apply normalization filters. The default implementation returns the TokenStream as-is. This is used by Analyzer.normalize(String, String).

public int getPositionIncrementGap(String fieldName)

Description copied from class: Analyzer
Invoked before indexing an IndexableField instance if terms have already been added to that field.

Overrides:
getPositionIncrementGap in class Analyzer
Parameters:
fieldName - IndexableField name being indexed.
Returns:
position increment gap, added to the next token emitted from Analyzer.tokenStream(String,Reader). This value must be >= 0.

public int getOffsetGap(String fieldName)

Description copied from class: Analyzer
Just like Analyzer.getPositionIncrementGap(java.lang.String), except for Token offsets instead. By default this returns 1. This method is only called if the field produced at least one token for indexing.

Overrides:
getOffsetGap in class Analyzer
Parameters:
fieldName - the field just indexed
Returns:
offset gap, added to the next token emitted from Analyzer.tokenStream(String,Reader). This value must be >= 0.

public final Reader initReader(String fieldName, Reader reader)

Description copied from class: Analyzer
Override this if you want to add a CharFilter chain. The default implementation returns reader unchanged.

Overrides:
initReader in class Analyzer
Parameters:
fieldName - IndexableField name being indexed
reader - original Reader

protected final Reader initReaderForNormalization(String fieldName, Reader reader)

Description copied from class: Analyzer
Wrap the given Reader with CharFilters that make sense for normalization. This is typically a subset of the CharFilters that are applied in Analyzer.initReader(String, Reader). This is used by Analyzer.normalize(String, String).

Overrides:
initReaderForNormalization in class Analyzer
protected final AttributeFactory attributeFactory(String fieldName)

Description copied from class: Analyzer
Return the AttributeFactory to be used for analysis and normalization on the given FieldName. The default implementation returns TokenStream.DEFAULT_TOKEN_ATTRIBUTE_FACTORY.

Overrides:
attributeFactory in class Analyzer
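Since initReader(String, Reader) is final here, a subclass adds a CharFilter chain by overriding wrapReader(String, Reader) instead. A minimal sketch: the wrapper name is hypothetical, and HTMLStripCharFilter is assumed to be available from the Lucene analyzers-common module.

```java
import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.AnalyzerWrapper;
import org.apache.lucene.analysis.charfilter.HTMLStripCharFilter;

// Hypothetical wrapper that strips HTML markup before the delegate tokenizes.
public final class HtmlStrippingWrapper extends AnalyzerWrapper {
  private final Analyzer delegate;

  public HtmlStrippingWrapper(Analyzer delegate) {
    super(delegate.getReuseStrategy());
    this.delegate = delegate;
  }

  @Override
  protected Analyzer getWrappedAnalyzer(String fieldName) {
    return delegate;
  }

  @Override
  protected Reader wrapReader(String fieldName, Reader reader) {
    // Prepend a CharFilter so markup never reaches the wrapped tokenizer.
    return new HTMLStripCharFilter(reader);
  }
}
```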
Copyright © 2000-2019 Apache Software Foundation. All Rights Reserved.