Class SegTokenFilter
java.lang.Object
org.apache.lucene.analysis.cn.smart.hhmm.SegTokenFilter
Filters a SegToken by converting full-width Latin to half-width, then lowercasing Latin.
Additionally, all punctuation is converted into Utility.COMMON_DELIMITER.
WARNING: This API is experimental and might change in incompatible ways in the next release.
Constructor Summary
Method Summary
Constructor Details
SegTokenFilter
public SegTokenFilter()
Method Details
filter
Filter an input SegToken.
Full-width Latin will be converted to half-width, then all Latin will be lowercased. All punctuation is converted into Utility.COMMON_DELIMITER.
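The normalization described above can be illustrated with a small standalone sketch. It does not use the Lucene classes and is not the library's implementation: the class name SegTokenFilterSketch, both helper methods, and the ASSUMED_COMMON_DELIMITER value are hypothetical stand-ins chosen for illustration. It relies only on the fact that the Unicode full-width forms U+FF01..U+FF5E sit at a fixed offset of 0xFEE0 from ASCII U+0021..U+007E.

```java
import java.util.Arrays;

class SegTokenFilterSketch {
    // Distance between the full-width block (U+FF01..U+FF5E) and ASCII (U+0021..U+007E).
    static final int FULLWIDTH_OFFSET = 0xFEE0;

    // Assumption: placeholder for Utility.COMMON_DELIMITER, whose value is not given above.
    static final char[] ASSUMED_COMMON_DELIMITER = {','};

    /** Converts full-width characters to half-width, then lowercases ASCII Latin letters. */
    static char[] normalizeLatin(char[] input) {
        char[] out = Arrays.copyOf(input, input.length);
        for (int i = 0; i < out.length; i++) {
            char c = out[i];
            if (c >= '\uFF01' && c <= '\uFF5E') {
                c = (char) (c - FULLWIDTH_OFFSET); // full-width -> half-width
            }
            if (c >= 'A' && c <= 'Z') {
                c = (char) (c + ('a' - 'A')); // uppercase -> lowercase
            }
            out[i] = c;
        }
        return out;
    }

    /** Collapses any punctuation token to the single assumed common delimiter. */
    static char[] normalizePunctuation(char[] punctuation) {
        return ASSUMED_COMMON_DELIMITER.clone();
    }

    public static void main(String[] args) {
        // Full-width "L" and "u" followed by half-width "cene".
        char[] word = "\uFF2C\uFF55cene".toCharArray();
        System.out.println(new String(normalizeLatin(word))); // prints "lucene"
        // An ideographic full stop is replaced by the assumed delimiter.
        System.out.println(new String(normalizePunctuation("\u3002".toCharArray())));
    }
}
```

The sketch copies its input before modifying it. Whether the real filter mutates the SegToken in place or returns a new one is not stated here, so callers should consult the Lucene source before relying on either behavior.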