Class SerbianNormalizationFilter

  • All Implemented Interfaces:
    Closeable, AutoCloseable, Unwrappable<TokenStream>

    public final class SerbianNormalizationFilter
    extends TokenFilter
    Normalizes Serbian Cyrillic and Latin characters to "bald" Latin.

    Cyrillic characters are first converted to Latin; then, Latin characters have their diacritics removed, with the exception of đ, which is converted to dj.

    Note that this filter expects lowercased input.
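
    The two-step normalization described above can be illustrated with a small standalone sketch. This is not the Lucene implementation: the class name SerbianBaldLatinSketch, the method normalize, and both mapping tables are assumptions made for illustration, and the sketch operates on plain strings rather than on a TokenStream. It lowercases the input before normalizing, since the tables only cover lowercase letters.

```java
import java.util.Locale;
import java.util.Map;

// Illustrative sketch only; hypothetical class, not part of Lucene.
final class SerbianBaldLatinSketch {

    // Step 1 (assumed table): lowercase Serbian Cyrillic to Serbian Latin.
    private static final Map<Character, String> CYRILLIC_TO_LATIN = Map.ofEntries(
        Map.entry('а', "a"), Map.entry('б', "b"), Map.entry('в', "v"),
        Map.entry('г', "g"), Map.entry('д', "d"), Map.entry('ђ', "đ"),
        Map.entry('е', "e"), Map.entry('ж', "ž"), Map.entry('з', "z"),
        Map.entry('и', "i"), Map.entry('ј', "j"), Map.entry('к', "k"),
        Map.entry('л', "l"), Map.entry('љ', "lj"), Map.entry('м', "m"),
        Map.entry('н', "n"), Map.entry('њ', "nj"), Map.entry('о', "o"),
        Map.entry('п', "p"), Map.entry('р', "r"), Map.entry('с', "s"),
        Map.entry('т', "t"), Map.entry('ћ', "ć"), Map.entry('у', "u"),
        Map.entry('ф', "f"), Map.entry('х', "h"), Map.entry('ц', "c"),
        Map.entry('ч', "č"), Map.entry('џ', "dž"), Map.entry('ш', "š")
    );

    // Step 2 (assumed table): strip diacritics; đ is the exception and becomes "dj".
    private static final Map<Character, String> LATIN_TO_BALD = Map.of(
        'č', "c", 'ć', "c", 'š', "s", 'ž', "z", 'đ', "dj"
    );

    // Expects lowercased input, mirroring the note in the class description.
    static String normalize(String lowercased) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < lowercased.length(); i++) {
            char c = lowercased.charAt(i);
            // Cyrillic letters become Latin; everything else passes through.
            String latin = CYRILLIC_TO_LATIN.getOrDefault(c, String.valueOf(c));
            for (int j = 0; j < latin.length(); j++) {
                char l = latin.charAt(j);
                // Latin letters with diacritics become "bald" Latin.
                out.append(LATIN_TO_BALD.getOrDefault(l, String.valueOf(l)));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Lowercase first, because the tables above only cover lowercase letters.
        String input = "Ђорђе".toLowerCase(Locale.ROOT);
        System.out.println(normalize(input)); // prints "djordje"
        System.out.println(normalize("čaša")); // prints "casa"
    }
}
```

    In this sketch, Cyrillic and Latin spellings of the same word end up as the same string, which is the point of normalizing both scripts to one form. Uppercase letters passed directly to normalize would be left unchanged, so the lowercasing step is not optional.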