Package org.apache.lucene.analysis.core
Basic, general-purpose analysis components.
Class Summary

Class: Description
DecimalDigitFilter: Folds all Unicode digits in [:General_Category=Decimal_Number:] to Basic Latin digits (0-9).
DecimalDigitFilterFactory: Factory for DecimalDigitFilter.
FlattenGraphFilter: Converts an incoming graph token stream, such as one from SynonymGraphFilter, into a flat form so that all nodes form a single linear chain with no side paths.
FlattenGraphFilterFactory: Factory for FlattenGraphFilter.
KeywordAnalyzer: "Tokenizes" the entire stream as a single token.
KeywordTokenizer: Emits the entire input as a single token.
KeywordTokenizerFactory: Factory for KeywordTokenizer.
LetterTokenizer: A tokenizer that divides text at non-letters.
LetterTokenizerFactory: Factory for LetterTokenizer.
LowerCaseFilter: Normalizes token text to lower case.
LowerCaseFilterFactory: Factory for LowerCaseFilter.
SimpleAnalyzer
StopAnalyzer
StopFilter: Removes stop words from a token stream.
StopFilterFactory: Factory for StopFilter.
TypeTokenFilter: Removes tokens whose types appear in a set of blocked types from a token stream.
TypeTokenFilterFactory: Factory class for TypeTokenFilter.
UnicodeWhitespaceAnalyzer: An Analyzer that uses UnicodeWhitespaceTokenizer.
UnicodeWhitespaceTokenizer: A tokenizer that divides text at whitespace.
UpperCaseFilter: Normalizes token text to upper case.
UpperCaseFilterFactory: Factory for UpperCaseFilter.
WhitespaceAnalyzer: An Analyzer that uses WhitespaceTokenizer.
WhitespaceTokenizer: A tokenizer that divides text at whitespace characters as defined by Character.isWhitespace(int).
WhitespaceTokenizerFactory: Factory for WhitespaceTokenizer.
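
The behavior summarized in the table can be illustrated with a small plain-Java sketch. It does not use any Lucene classes: AnalysisSketch and all of its methods are hypothetical stand-ins written only with java.lang.Character and the standard collections, and they approximate the descriptions above (splitting at whitespace or non-letters, folding decimal digits, lower-casing, and removing stop words) rather than reproduce the library's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.IntPredicate;

// Illustrative only: hypothetical plain-Java stand-ins, not Lucene APIs.
final class AnalysisSketch {

    // Collects maximal runs of code points accepted by isTokenChar.
    static List<String> split(String text, IntPredicate isTokenChar) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (isTokenChar.test(cp)) {
                current.appendCodePoint(cp);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Approximates the WhitespaceTokenizer description:
    // divide at characters for which Character.isWhitespace(int) is true.
    static List<String> splitAtWhitespace(String text) {
        return split(text, cp -> !Character.isWhitespace(cp));
    }

    // Approximates the LetterTokenizer description: divide at non-letters.
    static List<String> splitAtNonLetters(String text) {
        return split(text, cp -> Character.isLetter(cp));
    }

    // Approximates the DecimalDigitFilter description: map every code point
    // in General_Category=Decimal_Number to the Basic Latin digit 0-9.
    static String foldDigits(String token) {
        StringBuilder out = new StringBuilder();
        token.codePoints().forEach(cp -> {
            if (Character.getType(cp) == Character.DECIMAL_DIGIT_NUMBER) {
                out.append((char) ('0' + Character.digit(cp, 10)));
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    // Approximates LowerCaseFilter followed by StopFilter.
    static List<String> lowerCaseAndRemoveStopWords(List<String> tokens, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            String lower = token.toLowerCase(Locale.ROOT);
            if (!stopWords.contains(lower)) {
                kept.add(lower);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = splitAtWhitespace("The Quick Fox");
        // prints [quick, fox]
        System.out.println(lowerCaseAndRemoveStopWords(tokens, Set.of("the")));
    }
}
```

Chaining a splitting step with per-token transformations mirrors the tokenizer-then-filter arrangement that the analyzers in this package describe; in real code the corresponding Lucene classes would take the place of these helper methods.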