Class | Description |
---|---|
ASCIIFoldingFilter | Converts alphabetic, numeric, and symbolic Unicode characters that are not in the first 128 Unicode code points (the "Basic Latin" Unicode block) into their ASCII equivalents, where one exists. |
ASCIIFoldingFilterFactory | Factory for ASCIIFoldingFilter. |
CapitalizationFilter | A filter that applies normal capitalization rules to tokens. |
CapitalizationFilterFactory | Factory for CapitalizationFilter. |
CodepointCountFilter | Removes words that are too long or too short from the stream, with length measured in Unicode code points. |
CodepointCountFilterFactory | Factory for CodepointCountFilter. |
DateRecognizerFilter | Removes all tokens that cannot be parsed as a date using the provided DateFormat. |
DateRecognizerFilterFactory | Factory for DateRecognizerFilter. |
EmptyTokenStream | An always-exhausted token stream. |
FingerprintFilter | Outputs a single token that is the concatenation of the sorted, de-duplicated set of input tokens. |
FingerprintFilterFactory | Factory for FingerprintFilter. |
HyphenatedWordsFilter | When plain text is extracted from documents, many words end up hyphenated and split across two lines. This filter puts such words back together. |
HyphenatedWordsFilterFactory | Factory for HyphenatedWordsFilter. |
KeepWordFilter | A TokenFilter that keeps only tokens whose text is contained in the required words. |
KeepWordFilterFactory | Factory for KeepWordFilter. |
KeywordMarkerFilter | Marks terms as keywords via the KeywordAttribute. |
KeywordMarkerFilterFactory | Factory for KeywordMarkerFilter. |
KeywordRepeatFilter | This TokenFilter emits each incoming token twice: once as a keyword and once as a non-keyword, that is, once with KeywordAttribute.setKeyword(boolean) set to true and once set to false. |
KeywordRepeatFilterFactory | Factory for KeywordRepeatFilter. |
LengthFilter | Removes words that are too long or too short from the stream, with length measured in UTF-16 code units. |
LengthFilterFactory | Factory for LengthFilter. |
LimitTokenCountAnalyzer | This Analyzer limits the number of tokens while indexing. |
LimitTokenCountFilter | This TokenFilter limits the number of tokens while indexing. |
LimitTokenCountFilterFactory | Factory for LimitTokenCountFilter. |
LimitTokenOffsetFilter | Lets tokens pass through while their start offset is <= a configured limit; the first token past the limit does not pass and ends the stream. |
LimitTokenOffsetFilterFactory | Factory for LimitTokenOffsetFilter. |
LimitTokenPositionFilter | This TokenFilter limits its emitted tokens to those with positions that are not greater than the configured limit. |
LimitTokenPositionFilterFactory | Factory for LimitTokenPositionFilter. |
PatternKeywordMarkerFilter | Marks terms that match the provided pattern as keywords via the KeywordAttribute. |
PerFieldAnalyzerWrapper | This analyzer supports scenarios where different fields require different analysis techniques. |
PrefixAndSuffixAwareTokenFilter | Links two PrefixAwareTokenFilter instances. |
PrefixAwareTokenFilter | Joins two token streams and leaves the last token of the first stream available for updating the token values in the second stream. |
RemoveDuplicatesTokenFilter | A TokenFilter that removes tokens with the same position and term text as the previous token in the stream. |
RemoveDuplicatesTokenFilterFactory | Factory for RemoveDuplicatesTokenFilter. |
ScandinavianFoldingFilter | This filter folds Scandinavian characters åÅäæÄÆ->a and öÖøØ->o. |
ScandinavianFoldingFilterFactory | Factory for ScandinavianFoldingFilter. |
ScandinavianNormalizationFilter | This filter normalizes the use of the interchangeable Scandinavian characters æÆäÄöÖøØ and folded variants (aa, ao, ae, oe and oo) by transforming them to åÅæÆøØ. |
ScandinavianNormalizationFilterFactory | Factory for ScandinavianNormalizationFilter. |
SetKeywordMarkerFilter | Marks terms contained in the provided set as keywords via the KeywordAttribute. |
StemmerOverrideFilter | Provides the ability to override any KeywordAttribute-aware stemmer with custom dictionary-based stemming. |
StemmerOverrideFilter.Builder | This builder builds an FST for the StemmerOverrideFilter. |
StemmerOverrideFilter.StemmerOverrideMap | A read-only, 4-byte FST-backed map that allows fast case-insensitive key-value lookups for StemmerOverrideFilter. |
StemmerOverrideFilterFactory | Factory for StemmerOverrideFilter. |
TrimFilter | Trims leading and trailing whitespace from tokens in the stream. |
TrimFilterFactory | Factory for TrimFilter. |
TruncateTokenFilter | A token filter that truncates terms to a specific length. |
TruncateTokenFilterFactory | Factory for TruncateTokenFilter. |
WordDelimiterFilter | Splits words into subwords and performs optional transformations on subword groups. |
WordDelimiterFilterFactory | Factory for WordDelimiterFilterFactory's target, WordDelimiterFilter. |
WordDelimiterIterator | A BreakIterator-like API for iterating over subwords in text, according to WordDelimiterFilter rules. |
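
LengthFilter and CodepointCountFilter carry the same summary in the table above. The difference lies in the unit of length: UTF-16 code units for the former, Unicode code points for the latter. The following is a minimal plain-Java sketch of that distinction, not Lucene code: the class TokenLengthSketch and its helper keepByLength are hypothetical names introduced only for illustration, and the sketch works on a list of strings rather than a TokenStream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: hypothetical helper class, not part of Lucene.
public class TokenLengthSketch {

    // Length in UTF-16 code units (what String.length() returns).
    public static int codeUnitLength(String token) {
        return token.length();
    }

    // Length in Unicode code points; a surrogate pair counts once.
    public static int codePointLength(String token) {
        return token.codePointCount(0, token.length());
    }

    // Hypothetical helper: keep tokens whose length lies in [min, max],
    // counted in code points or in code units depending on the flag.
    public static List<String> keepByLength(List<String> tokens, int min, int max,
                                            boolean countCodePoints) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            int len = countCodePoints ? codePointLength(token) : codeUnitLength(token);
            if (len >= min && len <= max) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // U+1D11E (musical symbol G clef): one code point, two UTF-16 code units.
        String clef = "\uD834\uDD1E";
        List<String> tokens = Arrays.asList("a", clef, "abc");

        System.out.println(keepByLength(tokens, 1, 1, false).size()); // prints 1
        System.out.println(keepByLength(tokens, 1, 1, true).size());  // prints 2
    }
}
```

With a minimum and maximum of 1, counting code units keeps only "a", while counting code points also keeps the supplementary character. For text confined to the Basic Multilingual Plane the two measures agree.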
Copyright © 2000-2017 Apache Software Foundation. All Rights Reserved.