Analyzer for Japanese.
Class Summary

Class - Description
GraphvizFormatter - Outputs the dot (graphviz) string for the Viterbi lattice.
JapaneseAnalyzer - Analyzer for Japanese that uses morphological analysis.
JapaneseBaseFormFilter - Replaces term text with the BaseFormAttribute.
JapaneseCompletionAnalyzer - Analyzer for Japanese completion suggester.
JapaneseCompletionFilter - A TokenFilter that adds Japanese romanized tokens to the term attribute.
JapaneseIterationMarkCharFilter - Normalizes Japanese horizontal iteration marks (odoriji) to their expanded form.
JapaneseIterationMarkCharFilterFactory - Factory for JapaneseIterationMarkCharFilter.
JapaneseKatakanaStemFilter - A TokenFilter that normalizes common katakana spelling variations ending in a long sound character by removing this character (U+30FC).
JapaneseNumberFilter - A TokenFilter that normalizes Japanese numbers (kansūji) to regular Arabic decimal numbers in half-width characters.
JapaneseNumberFilter.NumberBuffer - Buffer that holds a Japanese number string and a position index used as a parsed-to marker.
JapaneseNumberFilterFactory - Factory for JapaneseNumberFilter.
JapanesePartOfSpeechStopFilter - Removes tokens that match a set of part-of-speech tags.
JapanesePartOfSpeechStopFilterFactory - Factory for JapanesePartOfSpeechStopFilter.
JapaneseReadingFormFilter - A TokenFilter that replaces the term attribute with the reading of a token in either katakana or romaji form.
JapaneseTokenizer - Tokenizer for Japanese that uses morphological analysis.
JapaneseTokenizerFactory - Factory for JapaneseTokenizer.
Token - Analyzed token with morphological data from its dictionary.
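
The katakana normalization listed in the class summary (removing a trailing long sound character, U+30FC) can be illustrated with a minimal standalone sketch. It does not use or reproduce the library's implementation: the class name KatakanaStemSketch, the method stem, and the minimumLength parameter are hypothetical names introduced only for this illustration, and the minimum-length guard is an assumption of the sketch.

```java
// Standalone illustration only; not the library's filter.
final class KatakanaStemSketch {
    // KATAKANA-HIRAGANA PROLONGED SOUND MARK (U+30FC).
    private static final char PROLONGED_SOUND_MARK = '\u30FC';

    // True if the term is non-empty and every char lies in the Unicode Katakana block.
    static boolean isKatakana(String term) {
        if (term.isEmpty()) {
            return false;
        }
        for (int i = 0; i < term.length(); i++) {
            if (Character.UnicodeBlock.of(term.charAt(i)) != Character.UnicodeBlock.KATAKANA) {
                return false;
            }
        }
        return true;
    }

    // Drops one trailing prolonged sound mark from an all-katakana term.
    // minimumLength is an assumed guard so that short words are left intact.
    static String stem(String term, int minimumLength) {
        boolean endsWithMark = !term.isEmpty()
                && term.charAt(term.length() - 1) == PROLONGED_SOUND_MARK;
        if (term.length() >= minimumLength && endsWithMark && isKatakana(term)) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }
}
```

With an assumed minimum length of 4, stem("コンピューター", 4) returns "コンピュータ", while the three-character コピー is returned unchanged because it is shorter than the guard.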

Enum Summary

Enum - Description
JapaneseCompletionFilter.Mode - Completion mode.
JapaneseTokenizer.Mode - Tokenization mode: this determines how the tokenizer handles compound and unknown words.
JapaneseTokenizer.Type - Token type reflecting the original source of this token.
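
The idea of normalizing Japanese numbers (kansūji) to Arabic decimal numbers, mentioned in the class summary, can be sketched in plain Java. This is a simplified illustration under stated assumptions, not the library's algorithm: it accepts only the kanji digits zero through nine, the units ten, hundred, and thousand, and the myriad units man (10^4) and oku (10^8). The class name KansujiSketch and its methods are hypothetical.

```java
// Standalone illustration only; covers a small subset of kansuji input.
final class KansujiSketch {
    // Kanji digits zero through nine; the index of each character is its value.
    private static final String DIGITS =
            "\u3007\u4E00\u4E8C\u4E09\u56DB\u4E94\u516D\u4E03\u516B\u4E5D";

    // Units that multiply the preceding digit within one four-digit section.
    private static long smallUnit(char c) {
        switch (c) {
            case '\u5341': return 10L;    // ju (ten)
            case '\u767E': return 100L;   // hyaku (hundred)
            case '\u5343': return 1000L;  // sen (thousand)
            default: return 0L;
        }
    }

    // Units that multiply the whole section accumulated so far.
    private static long bigUnit(char c) {
        switch (c) {
            case '\u4E07': return 10_000L;       // man
            case '\u5104': return 100_000_000L;  // oku
            default: return 0L;
        }
    }

    static long parse(String s) {
        long total = 0;    // completed man/oku sections
        long section = 0;  // value below the next man/oku unit
        long current = 0;  // digits read since the last unit
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            int digit = DIGITS.indexOf(c);
            if (digit >= 0) {
                current = current * 10 + digit;  // also handles positional forms
            } else if (smallUnit(c) > 0) {
                section += Math.max(current, 1) * smallUnit(c);  // bare unit means 1 x unit
                current = 0;
            } else if (bigUnit(c) > 0) {
                total += Math.max(section + current, 1) * bigUnit(c);
                section = 0;
                current = 0;
            } else {
                throw new IllegalArgumentException("unsupported character at index " + i);
            }
        }
        return total + section + current;
    }

    // Renders the parsed value as half-width (ASCII) decimal digits.
    static String normalize(String s) {
        return Long.toString(parse(s));
    }
}
```

Under these assumptions, normalize("三千二百") yields "3200" and the positional form 二〇二四 yields "2024". Inputs outside the supported character set raise an IllegalArgumentException rather than being guessed at.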