org.apache.lucene.analysis.ko (Lucene 9.2.0 nori API)

Analyzer for Korean.

Class Summary
Class	Description
DecompoundToken	A token that was generated from a compound.
DictionaryToken	A token stored in a `Dictionary`.
GraphvizFormatter	Outputs the DOT (Graphviz) string for the Viterbi lattice.
KoreanAnalyzer	Analyzer for Korean that uses morphological analysis.
KoreanNumberFilter	A `TokenFilter` that normalizes Korean numbers to regular Arabic decimal numbers in half-width characters.
KoreanNumberFilter.NumberBuffer	Buffer that holds a Korean number string and a position index used as a parsed-to marker.
KoreanNumberFilterFactory	Factory for `KoreanNumberFilter`.
KoreanPartOfSpeechStopFilter	Removes tokens that match a set of part-of-speech tags.
KoreanPartOfSpeechStopFilterFactory	Factory for `KoreanPartOfSpeechStopFilter`.
KoreanReadingFormFilter	Replaces term text with the `ReadingAttribute`, which is the Hangul transcription of Hanja characters.
KoreanReadingFormFilterFactory	Factory for `KoreanReadingFormFilter`.
KoreanTokenizer	Tokenizer for Korean that uses morphological analysis.
KoreanTokenizerFactory	Factory for `KoreanTokenizer`.
POS	Part of speech classification for Korean based on Sejong corpus classification.
Token	Analyzed token with morphological data.
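
The number normalization described for `KoreanNumberFilter` can be illustrated with a small standalone sketch. This is not Lucene's implementation and does not use the Lucene API: the class `KoreanNumberSketch` and its `normalize` method are hypothetical names introduced here, and the sketch assumes the input consists only of Hangul numerals (digits such as 삼, small units 십/백/천, and large units 만/억/조). Mixed input with Arabic digits or punctuation is out of scope. Hangul characters are written as Unicode escapes so the source compiles regardless of file encoding.

```java
// Hypothetical, simplified illustration; not the KoreanNumberFilter source.
class KoreanNumberSketch {
    // Digits 0-9 in order: yeong, il, i, sam, sa, o, yuk, chil, pal, gu.
    private static final String DIGITS =
        "\uC601\uC77C\uC774\uC0BC\uC0AC\uC624\uC721\uCE60\uD314\uAD6C";

    /** Returns the multiplier for a unit character, or 0 if c is not a unit. */
    private static long unit(char c) {
        switch (c) {
            case '\uC2ED': return 10L;                // sip
            case '\uBC31': return 100L;               // baek
            case '\uCC9C': return 1_000L;             // cheon
            case '\uB9CC': return 10_000L;            // man
            case '\uC5B5': return 100_000_000L;       // eok
            case '\uC870': return 1_000_000_000_000L; // jo
            default: return 0L;
        }
    }

    /** Converts a string of Hangul numerals to half-width Arabic digits. */
    static String normalize(String text) {
        long total = 0;            // sum of completed man/eok/jo groups
        long group = 0;            // value of the group being read (below 10,000)
        long digits = -1;          // pending run of digit characters, -1 if none
        boolean groupSeen = false; // true once the current group has any content

        for (char c : text.toCharArray()) {
            int d = DIGITS.indexOf(c);
            long u = unit(c);
            if (d >= 0) {
                // Consecutive digit characters are read positionally.
                digits = (digits < 0 ? 0 : digits * 10) + d;
                groupSeen = true;
            } else if (u >= 10_000L) {
                // A large unit closes the current group; an empty group counts as 1.
                if (digits >= 0) group += digits;
                total += (groupSeen ? group : 1) * u;
                group = 0;
                digits = -1;
                groupSeen = false;
            } else if (u > 0) {
                // A small unit multiplies the pending digits; no digits means 1.
                group += (digits < 0 ? 1 : digits) * u;
                digits = -1;
                groupSeen = true;
            } else {
                throw new IllegalArgumentException("Not a Hangul numeral: " + c);
            }
        }
        if (digits >= 0) group += digits;
        return Long.toString(total + group);
    }
}
```

Under these assumptions, `KoreanNumberSketch.normalize` maps 삼천이백 to `3200` and 일억이천만 to `120000000`. The sketch does not guard against `long` overflow.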

Enum Summary
Enum	Description
KoreanTokenizer.DecompoundMode	Decompound mode: this determines how the tokenizer handles `POS.Type.COMPOUND`, `POS.Type.INFLECT` and `POS.Type.PREANALYSIS` tokens.
KoreanTokenizer.Type	Token type reflecting the original source of this token.
POS.Tag	Part of speech tag for Korean based on Sejong corpus classification.
POS.Type	The type of the token.
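
The effect of the three `KoreanTokenizer.DecompoundMode` constants (`NONE`, `DISCARD`, `MIXED`) on a compound token can be sketched without the Lucene API. The `DecompoundSketch` class, its nested `Mode` enum, and the `emit` method below are hypothetical stand-ins for illustration only. The sketch assumes the compound has already been split into its parts, which in the real tokenizer comes from the dictionary.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the decompound modes; not Lucene code.
class DecompoundSketch {
    // Stand-in mirroring the constant names of KoreanTokenizer.DecompoundMode.
    enum Mode { NONE, DISCARD, MIXED }

    /** Returns the terms that would be emitted for one compound under the given mode. */
    static List<String> emit(String compound, List<String> parts, Mode mode) {
        List<String> out = new ArrayList<>();
        if (mode != Mode.DISCARD) {
            out.add(compound);  // NONE and MIXED keep the original compound form
        }
        if (mode != Mode.NONE) {
            out.addAll(parts);  // DISCARD and MIXED emit the decomposed parts
        }
        return out;
    }
}
```

For example, for the compound 가곡역 with parts 가곡 and 역, the sketch yields only 가곡역 under `NONE`, the two parts under `DISCARD`, and all three terms under `MIXED`.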
