Class SerbianNormalizationFilter

  • All Implemented Interfaces:
    Closeable, AutoCloseable

    public final class SerbianNormalizationFilter
    extends TokenFilter
    Normalizes Serbian Cyrillic and Latin characters to "bald" Latin, that is, Latin without diacritics. Cyrillic characters are first converted to Latin; the Latin characters then have their diacritics removed, except for đ, which is converted to dj. The filter expects lowercased input.
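
    The two-step folding described above can be illustrated with a small standalone sketch. It is not the filter's actual implementation and does not use the Lucene API: the class name SerbianFoldingSketch, the method normalize, and both mapping tables are assumptions introduced for illustration, and they cover only the lowercase Serbian alphabet.

```java
import java.util.Map;

// Hypothetical illustration of the behavior described above; not Lucene code.
public class SerbianFoldingSketch {

    // Step 1 (assumed table): lowercase Serbian Cyrillic to Serbian Latin,
    // with diacritics still present.
    private static final Map<Character, String> CYRILLIC_TO_LATIN = Map.ofEntries(
        Map.entry('а', "a"),  Map.entry('б', "b"),  Map.entry('в', "v"),
        Map.entry('г', "g"),  Map.entry('д', "d"),  Map.entry('ђ', "đ"),
        Map.entry('е', "e"),  Map.entry('ж', "ž"),  Map.entry('з', "z"),
        Map.entry('и', "i"),  Map.entry('ј', "j"),  Map.entry('к', "k"),
        Map.entry('л', "l"),  Map.entry('љ', "lj"), Map.entry('м', "m"),
        Map.entry('н', "n"),  Map.entry('њ', "nj"), Map.entry('о', "o"),
        Map.entry('п', "p"),  Map.entry('р', "r"),  Map.entry('с', "s"),
        Map.entry('т', "t"),  Map.entry('ћ', "ć"),  Map.entry('у', "u"),
        Map.entry('ф', "f"),  Map.entry('х', "h"),  Map.entry('ц', "c"),
        Map.entry('ч', "č"),  Map.entry('џ', "dž"), Map.entry('ш', "š")
    );

    // Step 2 (assumed table): strip diacritics from Latin letters;
    // đ is the exception and becomes the digraph "dj".
    private static final Map<Character, String> LATIN_TO_BALD = Map.of(
        'đ', "dj",
        'č', "c",
        'ć', "c",
        'š', "s",
        'ž', "z"
    );

    // Applies a per-character mapping; unmapped characters pass through unchanged.
    private static String mapChars(String input, Map<Character, String> table) {
        StringBuilder out = new StringBuilder(input.length() + 4);
        for (char c : input.toCharArray()) {
            String mapped = table.get(c);
            if (mapped != null) {
                out.append(mapped);
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Expects lowercased input, mirroring the precondition stated above.
    public static String normalize(String lowercased) {
        String latin = mapChars(lowercased, CYRILLIC_TO_LATIN);
        return mapChars(latin, LATIN_TO_BALD);
    }

    public static void main(String[] args) {
        System.out.println(normalize("ђорђе")); // prints "djordje"
        System.out.println(normalize("đorđe")); // prints "djordje"
        System.out.println(normalize("čaša"));  // prints "casa"
    }
}
```

    Because Cyrillic and Latin spellings of the same word fold to the same string, both forms can match each other after normalization. In this sketch uppercase letters pass through unchanged, which shows why a lowercasing step has to run first.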