SegTokenFilter
SegToken
Full-width latin will be converted to half-width, then all latin will be lowercased.
CharType constant of a given character.
SegToken representing the best segmentation of a sentence
SegToken by converting full-width latin to half-width, then lowercasing latin.
Set of stopwords.
TokenFilter that breaks sentences into words.
WordType of the text
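
The full-width to half-width conversion and lowercasing mentioned above can be illustrated with a minimal sketch. This is not the library's implementation: the class `FullWidthDemo` and the method `normalize` are hypothetical names introduced here. It assumes that "full-width latin" means the Unicode full-width ASCII variants (U+FF01 to U+FF5E), which sit at a fixed offset of 0xFEE0 from their half-width counterparts.

```java
public class FullWidthDemo {
    // Hypothetical helper, not part of any library API.
    // Assumption: full-width forms are the code points U+FF01..U+FF5E,
    // each exactly 0xFEE0 above its half-width ASCII equivalent.
    public static String normalize(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // Step 1: map a full-width character to its half-width form.
            if (c >= '\uFF01' && c <= '\uFF5E') {
                c = (char) (c - 0xFEE0);
            }
            // Step 2: lowercase ASCII letters only; other characters pass through.
            if (c >= 'A' && c <= 'Z') {
                c = (char) (c + ('a' - 'A'));
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Full-width "A", "b", "c" written as escapes to avoid source-encoding issues.
        System.out.println(normalize("\uFF21\uFF42\uFF43"));
    }
}
```

Characters outside both ranges, such as Chinese text, are left unchanged, so the sketch only touches the latin portion of mixed input.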