CapitalizationFilter |
A filter to apply normal capitalization rules to Tokens.
|
CodepointCountFilter |
Removes words that are too long or too short from the stream, with length measured in Unicode code points.
|
ConcatenateGraphFilter.BytesRefBuilderTermAttribute |
Attribute providing access to the term builder and UTF-16 conversion.
|
ConditionalTokenFilter |
Allows skipping TokenFilters based on the current set of attributes.
|
ConditionalTokenFilterFactory |
|
DelimitedTermFrequencyTokenFilter |
Characters before the delimiter are the "token"; the textual integer after it is the term frequency.
|
HyphenatedWordsFilter |
When plain text is extracted from documents, many words end up hyphenated and broken across
two lines; this filter joins such words back together.
|
KeywordMarkerFilter |
|
LengthFilter |
Removes words that are too long or too short from the stream.
|
RemoveDuplicatesTokenFilter |
A TokenFilter which filters out Tokens that have the same position and term text as the previous
token in the stream.
|
ScandinavianNormalizationFilter |
This filter normalizes the use of the interchangeable Scandinavian characters æÆäÄöÖøØ and folded
variants (aa, ao, ae, oe and oo) by transforming them to åÅæÆøØ.
|
ScandinavianNormalizer.Foldings |
List of possible foldings that can be used when configuring the filter.
|
StemmerOverrideFilter.StemmerOverrideMap |
A read-only, 4-byte FST-backed map that allows fast case-insensitive key-value lookups for
StemmerOverrideFilter.
|
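
The DelimitedTermFrequencyTokenFilter entry above describes a token of the form "text, delimiter, integer". The following is a minimal standalone sketch of that split in plain Java; it does not use the Lucene API. The class `DelimitedTermFrequencySketch`, the holder `TermWithFrequency`, and the method `parse` are hypothetical names introduced for illustration. Splitting at the first delimiter and defaulting to a frequency of 1 when no delimiter is present are assumptions of this sketch, not documented behavior of the filter.

```java
public class DelimitedTermFrequencySketch {

    /** Hypothetical holder for one parsed token; not a Lucene type. */
    public static final class TermWithFrequency {
        public final String term;
        public final int frequency;

        TermWithFrequency(String term, int frequency) {
            this.term = term;
            this.frequency = frequency;
        }
    }

    /**
     * Splits {@code raw} at the first occurrence of {@code delimiter}.
     * Characters before the delimiter become the term; the text after it
     * is parsed as an integer term frequency.
     * Assumption: input without a delimiter keeps a frequency of 1.
     */
    public static TermWithFrequency parse(String raw, char delimiter) {
        int idx = raw.indexOf(delimiter);
        if (idx < 0) {
            return new TermWithFrequency(raw, 1);
        }
        String term = raw.substring(0, idx);
        // Integer.parseInt throws NumberFormatException on non-numeric text.
        int frequency = Integer.parseInt(raw.substring(idx + 1));
        if (frequency < 1) {
            throw new IllegalArgumentException("term frequency must be >= 1: " + raw);
        }
        return new TermWithFrequency(term, frequency);
    }

    public static void main(String[] args) {
        for (String raw : new String[] {"lucene|3", "search|12", "plain"}) {
            TermWithFrequency t = parse(raw, '|');
            System.out.println(t.term + " -> " + t.frequency);
        }
    }
}
```

Running `main` prints one line per input, pairing each term with its parsed frequency. In an actual analysis chain the filter itself would perform this step on each token; the sketch only shows the string-level idea.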