org.apache.lucene.analysis.cn.smart.hhmm
Class SegTokenFilter
java.lang.Object
org.apache.lucene.analysis.cn.smart.hhmm.SegTokenFilter
public class SegTokenFilter
- extends Object
Filters a SegToken by converting full-width Latin characters to half-width, then lowercasing all Latin characters.
Additionally, all punctuation is converted to Utility.COMMON_DELIMITER.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
SegTokenFilter
public SegTokenFilter()
filter
public SegToken filter(SegToken token)
- Filters an input SegToken.
Full-width Latin characters are converted to half-width, then all Latin characters are lowercased.
All punctuation is converted to Utility.COMMON_DELIMITER.
- Parameters:
token - input SegToken
- Returns:
- normalized SegToken
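The normalization described for filter can be illustrated with a small standalone sketch. It does not use Lucene and is not the library's implementation. The class name SegTokenFilterSketch, the method normalize, the isPunctuation flag, and the ASSUMED_DELIMITER value are all hypothetical; in particular, the actual contents of Utility.COMMON_DELIMITER are defined by Lucene and are only assumed here. The sketch also assumes that the full-width forms U+FF01 to U+FF5E map to ASCII by subtracting 0xFEE0.

```java
final class SegTokenFilterSketch {
    // Assumption: stand-in for Utility.COMMON_DELIMITER; the real value is defined by Lucene.
    static final char[] ASSUMED_DELIMITER = {','};

    // Returns a normalized copy of chars. The caller states whether the token
    // is punctuation (hypothetical flag; this sketch does not classify tokens).
    static char[] normalize(char[] chars, boolean isPunctuation) {
        if (isPunctuation) {
            // All punctuation collapses to the single common delimiter.
            return ASSUMED_DELIMITER.clone();
        }
        char[] out = chars.clone();
        for (int i = 0; i < out.length; i++) {
            char c = out[i];
            // Step 1: full-width form (U+FF01..U+FF5E) to its half-width ASCII counterpart.
            if (c >= 0xFF01 && c <= 0xFF5E) {
                c = (char) (c - 0xFEE0);
            }
            // Step 2: lowercase ASCII Latin letters.
            if (c >= 'A' && c <= 'Z') {
                c = (char) (c + 0x20);
            }
            out[i] = c;
        }
        return out;
    }

    public static void main(String[] args) {
        // Full-width "LUCENE3" written with Unicode escapes.
        char[] fullWidth = "\uFF2C\uFF35\uFF23\uFF25\uFF2E\uFF25\uFF13".toCharArray();
        System.out.println(new String(normalize(fullWidth, false)));
        System.out.println(new String(normalize("Lucene".toCharArray(), false)));
        // U+3002 is the ideographic full stop, treated here as punctuation.
        System.out.println(new String(normalize("\u3002".toCharArray(), true)));
    }
}
```

The order of the two steps matters: a full-width capital letter must first be mapped to its half-width form before the lowercase step can recognize it as a Latin letter.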
Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.