Package org.apache.lucene.analysis.core
Basic, general-purpose analysis components.
Class                       Description
DecimalDigitFilter          Folds all Unicode digits in [:General_Category=Decimal_Number:] to Basic Latin digits (0-9).
DecimalDigitFilterFactory   Factory for DecimalDigitFilter.
FlattenGraphFilter          Converts an incoming graph token stream, such as one from SynonymGraphFilter, into a flat form so that all nodes form a single linear chain with no side paths.
FlattenGraphFilterFactory   Factory for FlattenGraphFilter.
KeywordAnalyzer             "Tokenizes" the entire stream as a single token.
KeywordTokenizer            Emits the entire input as a single token.
KeywordTokenizerFactory     Factory for KeywordTokenizer.
LetterTokenizer             A LetterTokenizer is a tokenizer that divides text at non-letters.
LetterTokenizerFactory      Factory for LetterTokenizer.
LowerCaseFilter             Normalizes token text to lower case.
LowerCaseFilterFactory      Factory for LowerCaseFilter.
StopFilter                  Removes stop words from a token stream.
StopFilterFactory           Factory for StopFilter.
TypeTokenFilter             Removes tokens whose types appear in a set of blocked types from a token stream.
TypeTokenFilterFactory      Factory class for TypeTokenFilter.
UnicodeWhitespaceAnalyzer   An Analyzer that uses UnicodeWhitespaceTokenizer.
UnicodeWhitespaceTokenizer  A UnicodeWhitespaceTokenizer is a tokenizer that divides text at whitespace.
UpperCaseFilter             Normalizes token text to UPPER CASE.
UpperCaseFilterFactory      Factory for UpperCaseFilter.
WhitespaceAnalyzer          An Analyzer that uses WhitespaceTokenizer.
WhitespaceTokenizer         A tokenizer that divides text at whitespace characters as defined by Character.isWhitespace(int).
WhitespaceTokenizerFactory  Factory for WhitespaceTokenizer.
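
The splitting and folding rules described above can be illustrated with plain JDK calls. The following is a minimal standalone sketch, not Lucene code: the class CoreAnalysisSketch and its methods are hypothetical names introduced for illustration, and the sketch ignores details of the real components such as token attributes, offsets, and maximum token length. It assumes that "letters" means Character.isLetter(int) and that "whitespace" means Character.isWhitespace(int), as the WhitespaceTokenizer description states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Standalone illustration only: this does not use or reproduce Lucene's classes.
final class CoreAnalysisSketch {

    // Mimics the documented behavior of DecimalDigitFilter: every code point in
    // General_Category=Decimal_Number is replaced by its Basic Latin digit (0-9).
    static String foldDigits(String text) {
        StringBuilder out = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (Character.getType(cp) == Character.DECIMAL_DIGIT_NUMBER) {
                out.append((char) ('0' + Character.digit(cp, 10)));
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    // Collects maximal runs of code points accepted by isTokenChar;
    // every rejected code point acts as a token boundary and is dropped.
    static List<String> tokenize(String text, IntPredicate isTokenChar) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (isTokenChar.test(cp)) {
                current.appendCodePoint(cp);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        });
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Divides text at whitespace, in the spirit of WhitespaceTokenizer.
    static List<String> whitespaceTokens(String text) {
        return tokenize(text, cp -> !Character.isWhitespace(cp));
    }

    // Divides text at non-letters, in the spirit of LetterTokenizer.
    static List<String> letterTokens(String text) {
        return tokenize(text, Character::isLetter);
    }

    public static void main(String[] args) {
        String sample = "Lucene 9,\tanalysis-core";
        System.out.println(whitespaceTokens(sample)); // [Lucene, 9,, analysis-core]
        System.out.println(letterTokens(sample));     // [Lucene, analysis, core]
        // \u0661\u0662\u0663 are the Arabic-Indic digits one, two, three.
        System.out.println(foldDigits("\u0661\u0662\u0663 and 45")); // 123 and 45
    }
}
```

The two tokenizers differ only in the predicate that decides which code points belong to a token, which is why the sketch shares a single tokenize helper between them.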