Class GermanNormalizationFilter

All Implemented Interfaces:
Closeable, AutoCloseable, Unwrappable<TokenStream>

public final class GermanNormalizationFilter extends TokenFilter
Normalizes German characters according to the heuristics of the German2 snowball algorithm. It allows for the fact that ä, ö and ü are sometimes written as ae, oe and ue.
  • 'ß' is replaced by 'ss'.
  • 'ä', 'ö', 'ü' are replaced by 'a', 'o', 'u', respectively.
  • 'ae' and 'oe' are replaced by 'a' and 'o', respectively.
  • 'ue' is replaced by 'u', when not following a vowel or q.
This is useful if you want this normalization without using the German2 stemmer, or perhaps no stemming at all.
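
The rules listed above can be illustrated with a small standalone sketch in plain Java. This is not the filter's actual implementation: the class name GermanNormalizationSketch and its helper methods are hypothetical, the input is assumed to be already lowercased, and the set of characters counted as vowels (a, e, i, o, u, y and the three umlauts) is an assumption made for illustration. Unicode escapes are used for the non-ASCII characters so the source compiles regardless of file encoding.

```java
// Hypothetical illustration of the documented rules; NOT Lucene's code.
final class GermanNormalizationSketch {

    // Characters treated as "vowel or q" for the 'ue' rule.
    // Assumption: a, e, i, o, u, y, q plus a-umlaut, o-umlaut, u-umlaut.
    private static boolean isVowelOrQ(char c) {
        return "aeiouyq\u00e4\u00f6\u00fc".indexOf(c) >= 0;
    }

    // True if the 'e' at index i completes an 'ae', 'oe' or 'ue' digraph
    // that the rules say should lose its 'e'.
    private static boolean completesDigraph(String term, int i) {
        if (i == 0) {
            return false;
        }
        char prev = term.charAt(i - 1);
        if (prev == 'a' || prev == 'o') {
            return true; // 'ae' -> 'a', 'oe' -> 'o'
        }
        // 'ue' -> 'u' only when the 'u' does not follow a vowel or q.
        return prev == 'u' && (i < 2 || !isVowelOrQ(term.charAt(i - 2)));
    }

    // Applies the four rules to a single, already lowercased term.
    static String normalize(String term) {
        StringBuilder out = new StringBuilder(term.length() + 1);
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            switch (c) {
                case '\u00df': out.append("ss"); break; // sharp s -> ss
                case '\u00e4': out.append('a'); break;  // a-umlaut -> a
                case '\u00f6': out.append('o'); break;  // o-umlaut -> o
                case '\u00fc': out.append('u'); break;  // u-umlaut -> u
                case 'e':
                    if (!completesDigraph(term, i)) {
                        out.append('e');
                    }
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```

Under these assumptions, normalize("mueller") and normalize("m\u00fcller") both yield "muller", so the two spellings match at search time, while "quelle" and "bauer" are left unchanged because their 'u' follows a q or a vowel.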