All Classes and Interfaces
Class
Description
Attribute for Token.getBaseForm().
Attribute for Token.getBaseForm().
Character category data.
Utility functions for JapaneseCompletionFilter.
n-gram connection cost data
Tool to build dictionaries.
Format of the dictionary.
Attribute for Kuromoji inflection data.
Attribute for Kuromoji inflection data.
Represents Japanese morphological information.
Analyzer for Japanese that uses morphological analysis.
Replaces term text with the BaseFormAttribute.
Factory for JapaneseBaseFormFilter.
Analyzer for Japanese completion suggester.
A TokenFilter that adds Japanese romanized tokens to the term attribute.
Completion mode
Factory for JapaneseCompletionFilter.
A TokenFilter that normalizes small letters (捨て仮名) in hiragana into normal letters.
Factory for JapaneseHiraganaUppercaseFilter.
Normalizes Japanese horizontal iteration marks (odoriji) to their expanded form.
Factory for JapaneseIterationMarkCharFilter.
A TokenFilter that normalizes common katakana spelling variations ending in a long sound character by removing this character (U+30FC).
Factory for JapaneseKatakanaStemFilter.
A TokenFilter that normalizes small letters (捨て仮名) in katakana into normal letters.
Factory for JapaneseKatakanaUppercaseFilter.
A TokenFilter that normalizes Japanese numbers (kansūji) to regular Arabic decimal numbers in half-width characters.
Buffer that holds a Japanese number string and a position index used as a parsed-to marker
Factory for JapaneseNumberFilter.
Removes tokens that match a set of part-of-speech tags.
Factory for JapanesePartOfSpeechStopFilter.
A TokenFilter that replaces the term attribute with the reading of a token in either katakana or romaji form.
Factory for JapaneseReadingFormFilter.
Tokenizer for Japanese that uses morphological analysis.
Tokenization mode: this determines how the tokenizer handles compound and unknown words.
Factory for JapaneseTokenizer.
Converts a Katakana string to Romaji using the pre-defined Katakana-Romaji mapping rules.
Attribute for Token.getPartOfSpeech().
Attribute for Token.getPartOfSpeech().
Attribute for Kuromoji reading data
Attribute for Kuromoji reading data
Analyzed token with morphological data from its dictionary.
Binary dictionary implementation for a known-word dictionary model: words are encoded into an FST mapping to a list of wordIDs.
Thin wrapper around an FST with root-arc caching for Japanese.
Utility class for English translations of morphological data, used only for debugging.
Dictionary for unknown-word handling.
Class for building a User Dictionary.
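
Two of the normalizations described in the index above, the ones paired with JapaneseKatakanaStemFilter and JapaneseKatakanaUppercaseFilter, can be sketched in plain Java to make the descriptions concrete. This is a minimal illustration under stated assumptions, not the library's code: the class name KanaNormalizationSketch, both method names, the minimum length of 4, and the partial small-letter table are all hypothetical choices made for this example, and it operates on plain strings rather than on a token stream.

```java
/** Illustrative sketch only; this is not the library's implementation. */
final class KanaNormalizationSketch {

    // Prolonged sound mark, U+30FC, the character named in the description.
    private static final char PROLONGED_SOUND_MARK = '\u30FC';

    // Assumption for this sketch: terms shorter than this are left untouched.
    private static final int MIN_LENGTH = 4;

    // Small katakana a, i, u, e, o, tsu, ya, yu, yo (a partial, hypothetical table).
    // In Unicode, each of these is immediately followed by its normal-size letter.
    private static final String SMALL_KATAKANA =
        "\u30A1\u30A3\u30A5\u30A7\u30A9\u30C3\u30E3\u30E5\u30E7";

    private KanaNormalizationSketch() {}

    /** Drops one trailing prolonged sound mark from an all-katakana term. */
    static String stemKatakana(String term) {
        int len = term.length();
        if (len < MIN_LENGTH || term.charAt(len - 1) != PROLONGED_SOUND_MARK) {
            return term;
        }
        for (int i = 0; i < len; i++) {
            char c = term.charAt(i);
            if (c < '\u30A0' || c > '\u30FF') {
                return term; // not purely katakana, so leave it as-is
            }
        }
        return term.substring(0, len - 1);
    }

    /** Replaces each small katakana letter in the table with its normal-size form. */
    static String uppercaseKatakana(String term) {
        StringBuilder out = new StringBuilder(term.length());
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            // The normal-size letter sits one code point after the small one.
            out.append(SMALL_KATAKANA.indexOf(c) >= 0 ? (char) (c + 1) : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Seven-letter katakana word ending in U+30FC: the final mark is removed.
        System.out.println(stemKatakana("\u30B3\u30F3\u30D4\u30E5\u30FC\u30BF\u30FC"));
        // Small yo and small tsu become their normal-size letters.
        System.out.println(uppercaseKatakana("\u30C1\u30E7\u30C3\u30C8"));
    }
}
```

With these assumptions, stemKatakana turns コンピューター into コンピュータ and leaves the three-letter コピー unchanged, while uppercaseKatakana turns チョット into チヨツト. The filters listed in this index work on token streams and may differ in their thresholds and in the letters they cover.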