Class SegTokenFilter

public class SegTokenFilter
Filters a SegToken by converting full-width Latin characters to half-width, then lowercasing Latin characters. Additionally, all punctuation is converted into Utility.COMMON_DELIMITER.

WARNING: The analyzers/smartcn package is experimental. The APIs and file formats introduced here may change in the future, in which case they will no longer be supported.

Constructor Summary
Method Summary
 SegToken filter(SegToken token)
          Filter an input SegToken
Constructor Detail


public SegTokenFilter()
Method Detail


public SegToken filter(SegToken token)
Filter an input SegToken

Full-width Latin characters are converted to half-width, then all Latin characters are lowercased. All punctuation is converted into Utility.COMMON_DELIMITER.

Parameters:
token - input SegToken
Returns:
normalized SegToken
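The normalization described above can be sketched without any Lucene dependency. The following is an illustration only, not the library's implementation: the class name SegTokenNormalizeSketch, the method normalize, and the DELIMITER value are hypothetical, the sketch works on plain strings rather than SegToken objects, and it assumes that a token counts as punctuation when every one of its characters falls in a Unicode punctuation category.

```java
// Illustrative sketch only; names and the delimiter value are assumptions, not Lucene's API.
final class SegTokenNormalizeSketch {

    // Hypothetical stand-in for Utility.COMMON_DELIMITER (actual value not stated on this page).
    static final String DELIMITER = ",";

    // Full-width forms of printable ASCII occupy U+FF01..U+FF5E,
    // exactly 0xFEE0 above their half-width counterparts.
    static char toHalfWidth(char c) {
        if (c >= '\uFF01' && c <= '\uFF5E') {
            return (char) (c - 0xFEE0);
        }
        return c;
    }

    // Assumption: "punctuation" means any Unicode punctuation general category.
    static boolean isPunctuation(char c) {
        switch (Character.getType(c)) {
            case Character.CONNECTOR_PUNCTUATION:
            case Character.DASH_PUNCTUATION:
            case Character.START_PUNCTUATION:
            case Character.END_PUNCTUATION:
            case Character.INITIAL_QUOTE_PUNCTUATION:
            case Character.FINAL_QUOTE_PUNCTUATION:
            case Character.OTHER_PUNCTUATION:
                return true;
            default:
                return false;
        }
    }

    // Applies the three steps in order: half-width conversion,
    // punctuation replacement, then lowercasing of ASCII letters.
    static String normalize(String token) {
        StringBuilder out = new StringBuilder(token.length());
        boolean allPunctuation = !token.isEmpty();
        for (int i = 0; i < token.length(); i++) {
            char c = toHalfWidth(token.charAt(i));
            if (!isPunctuation(c)) {
                allPunctuation = false;
            }
            if (c >= 'A' && c <= 'Z') {
                c = (char) (c + ('a' - 'A'));
            }
            out.append(c);
        }
        // A token made up entirely of punctuation collapses to the delimiter.
        return allPunctuation ? DELIMITER : out.toString();
    }
}
```

Under these assumptions, a call such as SegTokenNormalizeSketch.normalize("ＡＢＣ") produces the half-width, lowercase form of the input, while a punctuation-only token is replaced by the delimiter. Non-Latin text, such as Chinese characters, passes through unchanged.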

