SegTokenFilter
SegToken
CharType constant of a given character.
SegToken representing the best segmentation of a sentence.
SegToken by converting full-width latin to half-width, then lowercasing latin.
Set of stopwords.
TokenFilter that breaks sentences into words.
WordType of the text.
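
The full-width to half-width conversion and lowercasing mentioned in the entries above can be illustrated with a minimal sketch. This is not the library's own filter code: the class name `LatinWidthFold` and its methods are hypothetical, and the sketch assumes that "full-width latin" means the Unicode full-width ASCII variants U+FF01 through U+FF5E, each of which sits exactly 0xFEE0 above its half-width counterpart.

```java
// Hypothetical helper for illustration only; not part of any library API.
final class LatinWidthFold {
    private LatinWidthFold() {}

    // Assumption: full-width ASCII variants occupy U+FF01..U+FF5E and map to
    // U+0021..U+007E by subtracting a fixed offset.
    private static final int FULLWIDTH_START = 0xFF01;
    private static final int FULLWIDTH_END = 0xFF5E;
    private static final int FULLWIDTH_OFFSET = 0xFEE0;

    /** Folds one char: full-width variant to half-width, then A-Z to a-z. */
    static char fold(char c) {
        if (c >= FULLWIDTH_START && c <= FULLWIDTH_END) {
            c = (char) (c - FULLWIDTH_OFFSET);
        }
        if (c >= 'A' && c <= 'Z') {
            c = (char) (c + ('a' - 'A'));
        }
        return c;
    }

    /** Applies fold to every char; all other characters pass through unchanged. */
    static String normalize(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            out.append(fold(text.charAt(i)));
        }
        return out.toString();
    }
}
```

Calling `LatinWidthFold.normalize` on a token's text would fold full-width letters and digits to their ASCII forms and lowercase the letters, while leaving Chinese characters untouched. Doing the width conversion before the lowercasing means a single A-Z check covers both the original half-width letters and the converted ones.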