Class SnowballFilter

All Implemented Interfaces:
Closeable, AutoCloseable, Unwrappable<TokenStream>

public final class SnowballFilter extends TokenFilter
A filter that stems words using a Snowball-generated stemmer.

Available stemmers are listed in org.tartarus.snowball.ext.

Note: SnowballFilter expects lowercased text.

Note: This filter is aware of the KeywordAttribute. To prevent certain terms from being passed to the stemmer, mark them in a previous TokenStream so that KeywordAttribute.isKeyword() returns true.

Note: To emit the original term as well as the stemmed version, see KeywordRepeatFilterFactory.
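
The keyword behavior described in the notes above can be modeled without any Lucene classes. The following is a minimal sketch, not Lucene code: the Token class is a hypothetical stand-in for a token carrying a KeywordAttribute flag, and toyStem is a toy suffix stripper standing in for a real Snowball stemmer. It only illustrates that terms flagged as keywords reach the output unchanged while all other terms are stemmed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

public class KeywordAwareStemSketch {

    /** Hypothetical stand-in for a token whose KeywordAttribute.isKeyword() flag is known. */
    static final class Token {
        final String term;
        final boolean keyword;

        Token(String term, boolean keyword) {
            this.term = term;
            this.keyword = keyword;
        }
    }

    /** Applies the stemmer to every non-keyword token; keyword tokens pass through as-is. */
    static List<String> stemAll(List<Token> tokens, UnaryOperator<String> stemmer) {
        List<String> out = new ArrayList<>();
        for (Token t : tokens) {
            out.add(t.keyword ? t.term : stemmer.apply(t.term));
        }
        return out;
    }

    /** Toy stemmer (assumption for illustration): strips one trailing "s". Not a Snowball algorithm. */
    static String toyStem(String term) {
        if (term.length() > 1 && term.endsWith("s")) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    public static void main(String[] args) {
        List<Token> tokens = List.of(
                new Token("cats", false),   // ordinary term: stemmed
                new Token("lucas", true));  // flagged as keyword: left untouched
        System.out.println(stemAll(tokens, KeywordAwareStemSketch::toyStem)); // prints [cat, lucas]
    }
}
```

In a real analysis chain the flag would be set by an upstream TokenStream rather than passed to a constructor, and the input would already be lowercased before it reaches the filter.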

  • Constructor Details

    • SnowballFilter

      public SnowballFilter(TokenStream input, SnowballStemmer stemmer)
    • SnowballFilter

      public SnowballFilter(TokenStream in, String name)
      Construct the named stemming filter.

      Available stemmers are listed in org.tartarus.snowball.ext. The name of a stemmer is the part of the class name before "Stemmer", e.g., the stemmer in EnglishStemmer is named "English".

      Parameters:
      in - the input tokens to stem
      name - the name of a stemmer
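
The naming rule above can be sketched in plain Java. The helper class SnowballNames and its methods are hypothetical names introduced for illustration; they only encode the rule stated in this section (the stemmer name is the class name with the trailing "Stemmer" removed) and make no claim about how SnowballFilter resolves the name internally.

```java
public class SnowballNames {

    /** Package that, per this page, lists the available stemmers. */
    static final String PACKAGE = "org.tartarus.snowball.ext";
    static final String SUFFIX = "Stemmer";

    /** Derives the stemmer name from a simple class name, e.g. "EnglishStemmer" -> "English". */
    static String stemmerName(String simpleClassName) {
        if (!simpleClassName.endsWith(SUFFIX) || simpleClassName.length() == SUFFIX.length()) {
            throw new IllegalArgumentException("Not a stemmer class name: " + simpleClassName);
        }
        return simpleClassName.substring(0, simpleClassName.length() - SUFFIX.length());
    }

    /** Builds the fully qualified class name that a stemmer name corresponds to. */
    static String className(String name) {
        return PACKAGE + "." + name + SUFFIX;
    }

    public static void main(String[] args) {
        System.out.println(stemmerName("EnglishStemmer")); // prints English
        System.out.println(className("English"));          // prints org.tartarus.snowball.ext.EnglishStemmer
    }
}
```

With that rule, a filter for English text would be constructed as new SnowballFilter(in, "English"), where in is the upstream TokenStream.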
  • Method Details