All Classes and Interfaces
Class
Description
Base class for a binary-encoded in-memory dictionary.
Deprecated, for removal: This API element is subject to removal in a future version.
Character category data.
n-gram connection cost data
A token that was generated from a compound.
Dictionary interface for retrieving morphological data by id.
A morpheme extracted from a compound token.
Tool to build dictionaries.
A token stored in a Dictionary.
Outputs the dot (graphviz) string for the Viterbi lattice.
Analyzer for Korean that uses morphological analysis.
A TokenFilter that normalizes Korean numbers to regular Arabic decimal numbers in half-width characters.
Buffer that holds a Korean number string and a position index used as a parsed-to marker.
Factory for KoreanNumberFilter.
Removes tokens that match a set of part-of-speech tags.
Factory for KoreanPartOfSpeechStopFilter.
Replaces term text with the ReadingAttribute which is the Hangul transcription of Hanja characters.
Factory for KoreanReadingFormFilter.
Tokenizer for Korean that uses morphological analysis.
Decompound mode: this determines how the tokenizer handles COMPOUND, INFLECT and PREANALYSIS tokens.
Token type reflecting the original source of this token.
Factory for KoreanTokenizer.
Part of Speech attributes for Korean.
Part of Speech attributes for Korean.
Part of speech classification for Korean based on Sejong corpus classification.
Part of speech tag for Korean based on Sejong corpus classification.
The type of the token.
Attribute for Korean reading data
Attribute for Korean reading data
Analyzed token with morphological data.
Binary dictionary implementation for a known-word dictionary model: Words are encoded into an FST
mapping to a list of wordIDs.
Thin wrapper around an FST with root-arc caching for Hangul syllables (11,172 arcs).
Dictionary for unknown-word handling.
Class for building a User Dictionary.
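
The figure of 11,172 arcs quoted for the FST wrapper equals the number of precomposed Hangul syllables in Unicode (U+AC00 through U+D7A3, i.e. 19 initial consonants × 21 vowels × 28 final-consonant slots). The following is a minimal sketch of how a root-arc cache indexed by syllable could work. It is not the library's implementation: the class name `HangulRootCache`, its methods, and the use of a `long` as a stand-in for a cached arc are all hypothetical and chosen only for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration of a direct-indexed cache for Hangul syllables.
// None of these names come from the library; a long stands in for a cached arc.
public final class HangulRootCache {
    static final int HANGUL_BASE = 0xAC00;          // first precomposed syllable
    static final int HANGUL_END = 0xD7A3;           // last precomposed syllable
    static final int SYLLABLE_COUNT = 19 * 21 * 28; // 11,172 slots

    static final long MISSING = -1L;                // marker for "no cached arc"

    private final long[] rootTargets = new long[SYLLABLE_COUNT];

    HangulRootCache() {
        Arrays.fill(rootTargets, MISSING);
    }

    static boolean isHangulSyllable(int codePoint) {
        return codePoint >= HANGUL_BASE && codePoint <= HANGUL_END;
    }

    // Stores a value for a syllable; code points outside the block are ignored.
    void put(int codePoint, long target) {
        if (isHangulSyllable(codePoint)) {
            rootTargets[codePoint - HANGUL_BASE] = target;
        }
    }

    // Constant-time lookup; returns MISSING for uncached or non-Hangul input.
    long get(int codePoint) {
        if (!isHangulSyllable(codePoint)) {
            return MISSING;
        }
        return rootTargets[codePoint - HANGUL_BASE];
    }

    public static void main(String[] args) {
        HangulRootCache cache = new HangulRootCache();
        cache.put('\uAC00', 42L);
        System.out.println(SYLLABLE_COUNT);      // prints 11172
        System.out.println(cache.get('\uAC00')); // prints 42
        System.out.println(cache.get('A'));      // prints -1
    }
}
```

Indexing by `codePoint - HANGUL_BASE` turns the first step of a lookup into a single array access, which is the usual motivation for caching the root arcs of a structure that is queried once per token start.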