org.apache.lucene.analysis.miscellaneous (Lucene 4.7.2 API)

Package org.apache.lucene.analysis.miscellaneous

Miscellaneous TokenStreams

Class Summary
ASCIIFoldingFilter	Converts alphabetic, numeric, and symbolic Unicode characters that are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, where such equivalents exist.
ASCIIFoldingFilterFactory	Factory for `ASCIIFoldingFilter`.
CapitalizationFilter	A filter to apply normal capitalization rules to Tokens.
CapitalizationFilterFactory	Factory for `CapitalizationFilter`.
CodepointCountFilter	Removes words that are too long or too short from the stream, with length measured in Unicode code points.
CodepointCountFilterFactory	Factory for `CodepointCountFilter`.
EmptyTokenStream	An always exhausted token stream.
HyphenatedWordsFilter	Rejoins words that were hyphenated and broken across two lines, as often happens when plain text is extracted from documents.
HyphenatedWordsFilterFactory	Factory for `HyphenatedWordsFilter`.
KeepWordFilter	A TokenFilter that only keeps tokens with text contained in the required words.
KeepWordFilterFactory	Factory for `KeepWordFilter`.
KeywordMarkerFilter	Marks terms as keywords via the `KeywordAttribute`.
KeywordMarkerFilterFactory	Factory for `KeywordMarkerFilter`.
KeywordRepeatFilter	This TokenFilter emits each incoming token twice: once as a keyword and once as a non-keyword. In other words, once with `KeywordAttribute.setKeyword(boolean)` set to `true` and once set to `false`.
KeywordRepeatFilterFactory	Factory for `KeywordRepeatFilter`.
LengthFilter	Removes words that are too long or too short from the stream, with length measured in UTF-16 code units.
LengthFilterFactory	Factory for `LengthFilter`.
LimitTokenCountAnalyzer	This Analyzer limits the number of tokens while indexing.
LimitTokenCountFilter	This TokenFilter limits the number of tokens while indexing.
LimitTokenCountFilterFactory	Factory for `LimitTokenCountFilter`.
LimitTokenPositionFilter	This TokenFilter limits its emitted tokens to those with positions that are not greater than the configured limit.
LimitTokenPositionFilterFactory	Factory for `LimitTokenPositionFilter`.
PatternAnalyzer	Deprecated (4.0). Use the pattern-based analysis in the analysis/pattern package instead.
PatternKeywordMarkerFilter	Marks terms as keywords via the `KeywordAttribute`.
PerFieldAnalyzerWrapper	This analyzer is used to facilitate scenarios where different fields require different analysis techniques.
PrefixAndSuffixAwareTokenFilter	Links two `PrefixAwareTokenFilter` instances.
PrefixAwareTokenFilter	Joins two token streams and leaves the last token of the first stream available to be used when updating the token values in the second stream based on that token.
RemoveDuplicatesTokenFilter	A TokenFilter that removes tokens having the same position and term text as the previous token in the stream.
RemoveDuplicatesTokenFilterFactory	Factory for `RemoveDuplicatesTokenFilter`.
ScandinavianFoldingFilter	This filter folds Scandinavian characters åÅäæÄÆ->a and öÖøØ->o.
ScandinavianFoldingFilterFactory	Factory for `ScandinavianFoldingFilter`.
ScandinavianNormalizationFilter	This filter normalizes the use of the interchangeable Scandinavian characters æÆäÄöÖøØ and folded variants (aa, ao, ae, oe and oo) by transforming them to åÅæÆøØ.
ScandinavianNormalizationFilterFactory	Factory for `ScandinavianNormalizationFilter`.
SetKeywordMarkerFilter	Marks terms as keywords via the `KeywordAttribute`.
SingleTokenTokenStream	A `TokenStream` containing a single token.
StemmerOverrideFilter	Provides the ability to override any `KeywordAttribute` aware stemmer with custom dictionary-based stemming.
StemmerOverrideFilter.Builder	This builder builds an `FST` for the `StemmerOverrideFilter`.
StemmerOverrideFilter.StemmerOverrideMap	A read-only, 4-byte FST-backed map that allows fast case-insensitive key-value lookups for `StemmerOverrideFilter`.
StemmerOverrideFilterFactory	Factory for `StemmerOverrideFilter`.
TrimFilter	Trims leading and trailing whitespace from Tokens in the stream.
TrimFilterFactory	Factory for `TrimFilter`.
WordDelimiterFilter	Splits words into subwords and performs optional transformations on subword groups.
WordDelimiterFilterFactory	Factory for `WordDelimiterFilter`.
WordDelimiterIterator	A BreakIterator-like API for iterating over subwords in text, according to WordDelimiterFilter rules.
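
`LengthFilter` and `CodepointCountFilter` carry the same one-line summary in the table above, so the practical difference is easy to miss. The plain-Java sketch below does not use Lucene at all. It only illustrates the two length units, on the assumption that `LengthFilter` counts UTF-16 code units and `CodepointCountFilter` counts Unicode code points. The class `TokenLengthUnits` and its methods are hypothetical names introduced for this illustration.

```java
public class TokenLengthUnits {

    /** Length in UTF-16 code units (the unit LengthFilter is assumed to use). */
    public static int utf16Length(String term) {
        return term.length();
    }

    /** Length in Unicode code points (the unit CodepointCountFilter is assumed to use). */
    public static int codePointLength(String term) {
        return term.codePointCount(0, term.length());
    }

    /** Hypothetical helper: true if len lies within the inclusive range [min, max]. */
    public static boolean inRange(int len, int min, int max) {
        return len >= min && len <= max;
    }

    public static void main(String[] args) {
        // 'a' followed by U+1D11E (musical G clef), which needs a surrogate pair.
        String term = "a\uD834\uDD1E";
        System.out.println(utf16Length(term));     // 3 code units
        System.out.println(codePointLength(term)); // 2 code points
        // With limits min=1, max=2 the term passes a code-point check
        // but fails a UTF-16 check.
        System.out.println(inRange(codePointLength(term), 1, 2)); // true
        System.out.println(inRange(utf16Length(term), 1, 2));     // false
    }
}
```

For text confined to the Basic Multilingual Plane the two counts agree, so the choice only matters when supplementary characters may appear in terms.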

Package org.apache.lucene.analysis.miscellaneous Description

Miscellaneous TokenStreams
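
As a rough illustration of what a position-aware filter in this package does, the following standalone sketch mimics the idea behind `RemoveDuplicatesTokenFilter` without depending on Lucene. It is not the library's implementation. `Tok` is a hypothetical stand-in for a token, and the sketch assumes that "same position" means a position increment of 0 and that every term text already seen at the current position counts as a duplicate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DuplicateTokenSketch {

    /** Hypothetical token: term text plus its position increment. */
    public static final class Tok {
        public final String text;
        public final int posInc;

        public Tok(String text, int posInc) {
            this.text = text;
            this.posInc = posInc;
        }
    }

    /** Returns the term texts that survive duplicate removal, in stream order. */
    public static List<String> dedupe(List<Tok> in) {
        List<String> out = new ArrayList<>();
        Set<String> seenAtPosition = new HashSet<>();
        for (Tok t : in) {
            if (t.posInc > 0) {
                // The stream advanced to a new position: forget earlier texts.
                seenAtPosition.clear();
            }
            // Set.add returns false when the text was already seen here.
            if (seenAtPosition.add(t.text)) {
                out.add(t.text);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Tok> stream = Arrays.asList(
                new Tok("quick", 1),
                new Tok("fast", 0),   // synonym stacked at the same position
                new Tok("quick", 0),  // duplicate at the same position, dropped
                new Tok("quick", 1)); // same text at a new position, kept
        System.out.println(dedupe(stream)); // [quick, fast, quick]
    }
}
```

Duplicates of this kind typically arise when earlier filters stack several tokens at one position, for example a stemmer applied after synonym expansion.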
