Class CJKWidthFilter

  • All Implemented Interfaces:
    Closeable, AutoCloseable

    public final class CJKWidthFilter
    extends TokenFilter
    A TokenFilter that normalizes CJK width differences:
    • Folds fullwidth ASCII variants into the equivalent Basic Latin characters
    • Folds halfwidth Katakana variants into the equivalent standard (fullwidth) kana

    NOTE: This filter can be viewed as a (practical) subset of NFKC/NFKD Unicode normalization. For full normalization, see the normalization support in the ICU package.
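
The two folds described above, and their relationship to NFKC, can be illustrated with a small standalone sketch that uses only the JDK's java.text.Normalizer. It is not the filter's actual implementation: the class WidthFoldingSketch and the helpers foldWidth and isHalfwidthKana are hypothetical names introduced for illustration, and the code ranges used (U+FF01..U+FF5E for fullwidth ASCII, U+FF65..U+FF9F for halfwidth Katakana) are assumptions taken from the Unicode Halfwidth and Fullwidth Forms block, not from this page.

```java
import java.text.Normalizer;

class WidthFoldingSketch {

    // Hypothetical helper, not Lucene code: folds only the two character
    // ranges named in the description and leaves all other text untouched.
    static String foldWidth(String input) {
        StringBuilder out = new StringBuilder(input.length());
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c >= 0xFF01 && c <= 0xFF5E) {
                // Fullwidth ASCII variants sit at a fixed offset (0xFEE0)
                // above their Basic Latin equivalents.
                out.append((char) (c - 0xFEE0));
                i++;
            } else if (isHalfwidthKana(c)) {
                // Collect the whole halfwidth run so that NFKC can compose a
                // kana with a following halfwidth (semi-)voiced sound mark.
                int start = i;
                while (i < input.length() && isHalfwidthKana(input.charAt(i))) {
                    i++;
                }
                out.append(Normalizer.normalize(
                        input.substring(start, i), Normalizer.Form.NFKC));
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    // Assumed range: halfwidth Katakana letters, punctuation and sound marks.
    static boolean isHalfwidthKana(char c) {
        return c >= 0xFF65 && c <= 0xFF9F;
    }

    public static void main(String[] args) {
        // Fullwidth "ABC123" folds to plain ASCII.
        System.out.println(
                foldWidth("\uFF21\uFF22\uFF23\uFF11\uFF12\uFF13").equals("ABC123"));
        // Halfwidth KA + voiced sound mark folds to the single kana GA (U+30AC).
        System.out.println(foldWidth("\uFF76\uFF9E").equals("\u30AC"));
        // Text outside the two ranges is passed through unchanged.
        System.out.println(foldWidth("Lucene \u4E2D").equals("Lucene \u4E2D"));
    }
}
```

Restricting the work to these two ranges is what makes the fold a subset of NFKC: other compatibility mappings that full NFKC would apply (ligatures, circled digits and so on) are deliberately left alone in this sketch.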