Class HyphenationCompoundWordTokenFilter

  • All Implemented Interfaces:
    Closeable, AutoCloseable, Unwrappable<TokenStream>

    public class HyphenationCompoundWordTokenFilter
    extends CompoundWordTokenFilterBase

    A TokenFilter that decomposes compound words found in many Germanic languages.

    "Donaudampfschiff" becomes Donau, dampf, schiff so that you can find "Donaudampfschiff" even when you only enter "schiff". It uses a hyphenation grammar and a word dictionary to achieve this.