Package org.apache.lucene.analysis.ja
Analyzer for Japanese.
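Class: Description
JapaneseAnalyzer: Analyzer for Japanese that uses morphological analysis.
JapaneseBaseFormFilter: Replaces term text with the BaseFormAttribute.
JapaneseBaseFormFilterFactory: Factory for JapaneseBaseFormFilter.
JapaneseCompletionAnalyzer: Analyzer for Japanese completion suggester.
JapaneseCompletionFilter: A TokenFilter that adds Japanese romanized tokens to the term attribute.
JapaneseCompletionFilter.Mode: Completion mode.
JapaneseCompletionFilterFactory: Factory for JapaneseCompletionFilter.
JapaneseHiraganaUppercaseFilter: A TokenFilter that normalizes small letters (捨て仮名) in hiragana into normal letters.
JapaneseHiraganaUppercaseFilterFactory: Factory for JapaneseHiraganaUppercaseFilter.
JapaneseIterationMarkCharFilter: Normalizes Japanese horizontal iteration marks (odoriji) to their expanded form.
JapaneseIterationMarkCharFilterFactory: Factory for JapaneseIterationMarkCharFilter.
JapaneseKatakanaStemFilter: A TokenFilter that normalizes common katakana spelling variations ending in a long sound character by removing this character (U+30FC).
JapaneseKatakanaStemFilterFactory: Factory for JapaneseKatakanaStemFilter.
JapaneseKatakanaUppercaseFilter: A TokenFilter that normalizes small letters (捨て仮名) in katakana into normal letters.
JapaneseKatakanaUppercaseFilterFactory: Factory for JapaneseKatakanaUppercaseFilter.
JapaneseNumberFilter: A TokenFilter that normalizes Japanese numbers (kansūji) to regular Arabic decimal numbers in half-width characters.
JapaneseNumberFilter.NumberBuffer: Buffer that holds a Japanese number string and a position index used as a parsed-to marker.
JapaneseNumberFilterFactory: Factory for JapaneseNumberFilter.
JapanesePartOfSpeechStopFilter: Removes tokens that match a set of part-of-speech tags.
JapanesePartOfSpeechStopFilterFactory: Factory for JapanesePartOfSpeechStopFilter.
JapaneseReadingFormFilter: A TokenFilter that replaces the term attribute with the reading of a token in either katakana or romaji form.
JapaneseReadingFormFilterFactory: Factory for JapaneseReadingFormFilter.
JapaneseTokenizer: Tokenizer for Japanese that uses morphological analysis.
JapaneseTokenizer.Mode: Tokenization mode: this determines how the tokenizer handles compound and unknown words.
JapaneseTokenizerFactory: Factory for JapaneseTokenizer.
Token: Analyzed token with morphological data from its dictionary.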
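The katakana stemming rule described for JapaneseKatakanaStemFilter (dropping a trailing long sound character, U+30FC) can be illustrated with a small standalone sketch. This is plain Java and does not call or reproduce Lucene's implementation. The class name KatakanaStemSketch, the method stem, and the minimum length of four characters are assumptions made for illustration only.

```java
// Standalone illustration of the "remove trailing U+30FC" rule.
// Hypothetical helper, not part of Lucene.
final class KatakanaStemSketch {
    // U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK
    private static final char PROLONGED_SOUND_MARK = '\u30FC';

    // Assumed threshold: shorter terms are left untouched so that
    // short words ending in the mark are not over-stemmed.
    private static final int MIN_LENGTH = 4;

    static String stem(String term) {
        if (term.length() < MIN_LENGTH) {
            return term;
        }
        // Only stem terms written entirely in the Katakana block.
        for (int i = 0; i < term.length(); i++) {
            Character.UnicodeBlock block = Character.UnicodeBlock.of(term.charAt(i));
            if (block != Character.UnicodeBlock.KATAKANA) {
                return term;
            }
        }
        int last = term.length() - 1;
        if (term.charAt(last) == PROLONGED_SOUND_MARK) {
            return term.substring(0, last); // drop the trailing mark
        }
        return term;
    }

    public static void main(String[] args) {
        System.out.println(stem("コンピューター")); // trailing mark removed
        System.out.println(stem("コピー"));         // too short, unchanged
        System.out.println(stem("寿司"));           // not katakana, unchanged
    }
}
```

With this rule, spelling variants such as コンピューター and コンピュータ reduce to the same term, which is the kind of normalization the filter description refers to. Inside an analysis chain the same idea would be applied to each token's term text rather than to a plain String.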