| Interface | Description |
|---|---|
| ConcatenateGraphFilter.BytesRefBuilderTermAttribute | Attribute providing access to the term builder and UTF-16 conversion. |

| Class | Description |
|---|---|
| ASCIIFoldingFilter | This class converts alphabetic, numeric, and symbolic Unicode characters which are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if one exists. |
| ASCIIFoldingFilterFactory | Factory for ASCIIFoldingFilter. |
| CapitalizationFilter | A filter to apply normal capitalization rules to Tokens. |
| CapitalizationFilterFactory | Factory for CapitalizationFilter. |
| CodepointCountFilter | Removes words that are too long or too short from the stream, with length measured in Unicode code points. |
| CodepointCountFilterFactory | Factory for CodepointCountFilter. |
| ConcatenateGraphFilter | Concatenates/Joins every incoming token with a separator into one output token for every path through the token stream (which is a graph). |
| ConcatenateGraphFilter.BytesRefBuilderTermAttributeImpl | Implementation of ConcatenateGraphFilter.BytesRefBuilderTermAttribute. |
| ConcatenateGraphFilterFactory | Factory for ConcatenateGraphFilter. |
| ConcatenatingTokenStream | A TokenStream that takes an array of input TokenStreams as sources, and concatenates them together. |
| ConditionalTokenFilter | Allows skipping TokenFilters based on the current set of attributes. |
| ConditionalTokenFilterFactory | Abstract parent class for analysis factories that create ConditionalTokenFilter instances. |
| DateRecognizerFilter | Filters all tokens that cannot be parsed to a date, using the provided DateFormat. |
| DateRecognizerFilterFactory | Factory for DateRecognizerFilter. |
| DelimitedTermFrequencyTokenFilter | Characters before the delimiter are the "token"; the textual integer after it is the term frequency. |
| DelimitedTermFrequencyTokenFilterFactory | Factory for DelimitedTermFrequencyTokenFilter. |
| EmptyTokenStream | An always exhausted token stream. |
| FingerprintFilter | Outputs a single token that is the concatenation of the sorted and de-duplicated set of input tokens. |
| FingerprintFilterFactory | Factory for FingerprintFilter. |
| FixBrokenOffsetsFilter | Deprecated. Instead of using this filter, fix the token filters that create broken offsets in the first place. |
| FixBrokenOffsetsFilterFactory | Factory for FixBrokenOffsetsFilter. |
| HyphenatedWordsFilter | Rejoins words that were hyphenated and broken across two lines, which often happens when plain text is extracted from documents. |
| HyphenatedWordsFilterFactory | Factory for HyphenatedWordsFilter. |
| KeepWordFilter | A TokenFilter that only keeps tokens with text contained in the required words. |
| KeepWordFilterFactory | Factory for KeepWordFilter. |
| KeywordMarkerFilter | Marks terms as keywords via the KeywordAttribute. |
| KeywordMarkerFilterFactory | Factory for KeywordMarkerFilter. |
| KeywordRepeatFilter | This TokenFilter emits each incoming token twice, once as a keyword and once as a non-keyword; in other words, once with KeywordAttribute.setKeyword(boolean) set to true and once set to false. |
| KeywordRepeatFilterFactory | Factory for KeywordRepeatFilter. |
| LengthFilter | Removes words that are too long or too short from the stream, with length measured in UTF-16 code units. |
| LengthFilterFactory | Factory for LengthFilter. |
| LimitTokenCountAnalyzer | This Analyzer limits the number of tokens while indexing. |
| LimitTokenCountFilter | This TokenFilter limits the number of tokens while indexing. |
| LimitTokenCountFilterFactory | Factory for LimitTokenCountFilter. |
| LimitTokenOffsetFilter | Lets all tokens pass through until it sees one with a start offset greater than a configured limit; that token does not pass, and the stream ends. |
| LimitTokenOffsetFilterFactory | Factory for LimitTokenOffsetFilter. |
| LimitTokenPositionFilter | This TokenFilter limits its emitted tokens to those with positions that are not greater than the configured limit. |
| LimitTokenPositionFilterFactory | Factory for LimitTokenPositionFilter. |
| PatternKeywordMarkerFilter | Marks terms as keywords via the KeywordAttribute. |
| PerFieldAnalyzerWrapper | This analyzer is used to facilitate scenarios where different fields require different analysis techniques. |
| ProtectedTermFilter | A ConditionalTokenFilter that only applies its wrapped filters to tokens that are not contained in a protected set. |
| ProtectedTermFilterFactory | Factory for a ProtectedTermFilter. |
| RemoveDuplicatesTokenFilter | A TokenFilter that filters out Tokens with the same position and term text as the previous token in the stream. |
| RemoveDuplicatesTokenFilterFactory | Factory for RemoveDuplicatesTokenFilter. |
| ScandinavianFoldingFilter | This filter folds Scandinavian characters åÅäæÄÆ->a and öÖøØ->o. |
| ScandinavianFoldingFilterFactory | Factory for ScandinavianFoldingFilter. |
| ScandinavianNormalizationFilter | This filter normalizes the use of the interchangeable Scandinavian characters æÆäÄöÖøØ and folded variants (aa, ao, ae, oe and oo) by transforming them to åÅæÆøØ. |
| ScandinavianNormalizationFilterFactory | Factory for ScandinavianNormalizationFilter. |
| SetKeywordMarkerFilter | Marks terms as keywords via the KeywordAttribute. |
| StemmerOverrideFilter | Provides the ability to override any KeywordAttribute-aware stemmer with custom dictionary-based stemming. |
| StemmerOverrideFilter.Builder | This builder builds an FST for the StemmerOverrideFilter. |
| StemmerOverrideFilter.StemmerOverrideMap | A read-only, 4-byte, FST-backed map that allows fast case-insensitive key-value lookups for StemmerOverrideFilter. |
| StemmerOverrideFilterFactory | Factory for StemmerOverrideFilter. |
| TrimFilter | Trims leading and trailing whitespace from Tokens in the stream. |
| TrimFilterFactory | Factory for TrimFilter. |
| TruncateTokenFilter | A token filter for truncating terms to a specific length. |
| TruncateTokenFilterFactory | Factory for TruncateTokenFilter. |
| TypeAsSynonymFilter | Adds the TypeAttribute.type() as a synonym. |
| TypeAsSynonymFilterFactory | Factory for TypeAsSynonymFilter. |
| WordDelimiterFilter | Deprecated. Use WordDelimiterGraphFilter instead: it produces a correct token graph. |
| WordDelimiterFilterFactory | Deprecated. Use WordDelimiterGraphFilterFactory instead: it produces a correct token graph. |
| WordDelimiterGraphFilter | Splits words into subwords and performs optional transformations on subword groups, producing a correct token graph. |
| WordDelimiterGraphFilterFactory | Factory for WordDelimiterGraphFilter. |
| WordDelimiterIterator | A BreakIterator-like API for iterating over subwords in text, according to WordDelimiterGraphFilter rules. |

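
Several of the one-line descriptions above are easier to picture with concrete input. The following is a minimal plain-Java sketch of the behaviour described for FingerprintFilter, DelimitedTermFrequencyTokenFilter, and CodepointCountFilter. It does not use or reproduce the Lucene API: the class name `MiscFilterSketches` and all of its methods are hypothetical, and the separator, delimiter, and the "frequency 1 when no delimiter is present" rule are assumptions of this sketch rather than documented defaults.

```java
import java.util.List;
import java.util.TreeSet;

// Plain-Java sketches of three behaviours described in the table above.
// Nothing here is Lucene code: the class and method names are hypothetical.
class MiscFilterSketches {

    // FingerprintFilter idea: de-duplicate and sort the tokens, then join them.
    static String fingerprint(List<String> tokens, char separator) {
        // TreeSet drops duplicates and iterates in natural String order.
        return String.join(String.valueOf(separator), new TreeSet<>(tokens));
    }

    // DelimitedTermFrequencyTokenFilter idea: text before the delimiter is the term.
    static String termOf(String token, char delimiter) {
        int i = token.indexOf(delimiter);
        return i < 0 ? token : token.substring(0, i);
    }

    // ...and the integer after the delimiter is the term frequency.
    // Assumption of this sketch: a token with no delimiter has frequency 1.
    static int frequencyOf(String token, char delimiter) {
        int i = token.indexOf(delimiter);
        return i < 0 ? 1 : Integer.parseInt(token.substring(i + 1));
    }

    // CodepointCountFilter idea: keep a token only if its length, counted in
    // Unicode code points rather than UTF-16 code units, lies in [min, max].
    static boolean withinCodePointLength(String token, int min, int max) {
        int n = token.codePointCount(0, token.length());
        return n >= min && n <= max;
    }
}
```

With these helpers, the tokens `b`, `a`, `b` joined with a space give the fingerprint `a b`, and `hello|5` with delimiter `|` splits into the term `hello` and frequency 5. A string made of one ASCII letter followed by one supplementary character (such as an emoji) has two code points but three UTF-16 code units, which is where a code-point-based length check and `String.length()` disagree.
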
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.