Package | Description |
---|---|
org.apache.lucene.analysis.custom | A general-purpose Analyzer that can be created with a builder-style API. |
org.apache.lucene.analysis.miscellaneous | Miscellaneous TokenStreams. |
Classes in org.apache.lucene.analysis.miscellaneous used by org.apache.lucene.analysis.custom

Class | Description |
---|---|
ConditionalTokenFilterFactory | Abstract parent class for analysis factories that create ConditionalTokenFilter instances. |
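
The factories listed here create ConditionalTokenFilter instances, which this page describes as skipping wrapped TokenFilters based on the current set of attributes. The idea can be sketched in plain Java without any Lucene classes. Everything in this sketch (`ConditionalFilterSketch`, `applyWhen`, the all-caps condition) is a hypothetical illustration, not part of the Lucene API, and it works on plain strings rather than on a token stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

/** Plain-Java illustration of conditional filtering; not Lucene code. */
final class ConditionalFilterSketch {

    /**
     * Applies the wrapped transformation only to tokens for which the
     * condition holds; all other tokens pass through unchanged.
     */
    static List<String> applyWhen(List<String> tokens,
                                  Predicate<String> shouldFilter,
                                  UnaryOperator<String> wrapped) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(shouldFilter.test(token) ? wrapped.apply(token) : token);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("Lucene", "NASA", "Index");
        // Hypothetical condition: leave all-uppercase tokens untouched.
        Predicate<String> notAllCaps = t -> !t.equals(t.toUpperCase(Locale.ROOT));
        List<String> result = applyWhen(tokens, notAllCaps, t -> t.toLowerCase(Locale.ROOT));
        System.out.println(result); // prints [lucene, NASA, index]
    }
}
```

In Lucene itself the decision is made per token from the stream's attributes rather than from a string, so treat the predicate here only as a stand-in for that check.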
Classes in org.apache.lucene.analysis.miscellaneous used by org.apache.lucene.analysis.miscellaneous

Class | Description |
---|---|
CapitalizationFilter | A filter that applies normal capitalization rules to tokens. |
CodepointCountFilter | Removes words that are too long or too short from the stream, with length measured in Unicode code points. |
ConcatenateGraphFilter.BytesRefBuilderTermAttribute | Attribute providing access to the term builder and UTF-16 conversion. |
ConditionalTokenFilter | Allows skipping TokenFilters based on the current set of attributes. |
ConditionalTokenFilterFactory | Abstract parent class for analysis factories that create ConditionalTokenFilter instances. |
DelimitedTermFrequencyTokenFilter | Splits each token at a delimiter: the characters before the delimiter are the token, and the textual integer after it is the term frequency. |
HyphenatedWordsFilter | When plain text is extracted from documents, many words are often hyphenated and broken across two lines. This filter puts such words back together. |
KeywordMarkerFilter | Marks terms as keywords via the KeywordAttribute. |
LengthFilter | Removes words that are too long or too short from the stream, with length measured in UTF-16 code units. |
RemoveDuplicatesTokenFilter | A TokenFilter that filters out tokens with the same position and term text as the previous token in the stream. |
ScandinavianNormalizationFilter | Normalizes the use of the interchangeable Scandinavian characters æÆäÄöÖøØ and folded variants (aa, ao, ae, oe and oo) by transforming them to åÅæÆøØ. |
StemmerOverrideFilter.StemmerOverrideMap | A read-only, 4-byte, FST-backed map that allows fast case-insensitive key-value lookups for StemmerOverrideFilter. |
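
LengthFilter and CodepointCountFilter are listed with the same job: removing tokens that are too long or too short. As far as the Lucene Javadoc for the two classes states, the difference is the unit of length: UTF-16 code units for LengthFilter and Unicode code points for CodepointCountFilter. The plain-Java sketch below shows how the two measures diverge for characters outside the Basic Multilingual Plane. The class and method names are hypothetical and nothing here calls Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java illustration of the two length measures; not Lucene code. */
final class TokenLengthSketch {

    /** Keeps tokens whose length in UTF-16 code units lies in [min, max]. */
    static List<String> keepByUtf16Length(List<String> tokens, int min, int max) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            int len = token.length(); // String.length() counts UTF-16 code units
            if (len >= min && len <= max) {
                kept.add(token);
            }
        }
        return kept;
    }

    /** Keeps tokens whose length in Unicode code points lies in [min, max]. */
    static List<String> keepByCodePointCount(List<String> tokens, int min, int max) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            int len = token.codePointCount(0, token.length());
            if (len >= min && len <= max) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // U+1F600 twice: two code points, but four UTF-16 code units.
        String twoEmoji = "\uD83D\uDE00\uD83D\uDE00";
        List<String> tokens = Arrays.asList("ab", twoEmoji, "abcd");

        System.out.println(keepByUtf16Length(tokens, 1, 2).size());    // prints 1
        System.out.println(keepByCodePointCount(tokens, 1, 2).size()); // prints 2
    }
}
```

With bounds of 1 to 2, the two-emoji token is dropped when counting UTF-16 code units and kept when counting code points, which is the case where the choice between the two filters matters.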
Copyright © 2000-2020 Apache Software Foundation. All Rights Reserved.