Class UAX29URLEmailAnalyzer

All Implemented Interfaces:
Closeable, AutoCloseable

public final class UAX29URLEmailAnalyzer extends StopwordAnalyzerBase
Filters UAX29URLEmailTokenizer with LowerCaseFilter and StopFilter, using a list of English stop words.
Since:
3.6.0
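The filter chain described above can be illustrated without Lucene on the classpath. The following is a minimal plain-Java sketch, not Lucene's implementation: the class `AnalyzerPipelineSketch` and its `analyze` method are hypothetical names. It assumes the input has already been split into tokens, with URLs and email addresses kept whole as the tokenizer is described as doing, and it mimics only the lowercasing and stop-word removal steps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration of "lowercase, then drop stop words".
// It does not use or reproduce any Lucene class.
public class AnalyzerPipelineSketch {

    // Lowercases each token, then discards tokens found in stopWords.
    // Assumes stopWords are already lowercase.
    static List<String> analyze(List<String> tokens, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            String lowered = token.toLowerCase(Locale.ROOT);
            if (!stopWords.contains(lowered)) {
                out.add(lowered);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Tokens as a URL/email-aware tokenizer might emit them (assumed input).
        List<String> tokens =
            Arrays.asList("Mail", "the", "Report", "to", "Bob@Example.com");
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "to"));

        // prints [mail, report, bob@example.com]
        System.out.println(analyze(tokens, stopWords));
    }
}
```

Note that the email address survives as a single lowercased token, since the stop-word step only removes whole tokens that match the list.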
  • Field Details

    • DEFAULT_MAX_TOKEN_LENGTH

      public static final int DEFAULT_MAX_TOKEN_LENGTH
      Default maximum allowed token length.
    • STOP_WORDS_SET

      public static final CharArraySet STOP_WORDS_SET
      An unmodifiable set containing some common English words that are usually not useful for searching.
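Because the default set is unmodifiable, extending it means copying it first. The sketch below shows that pattern with standard `java.util` collections only. `StopWordSetSketch`, `DEFAULT_STOP_WORDS`, `extend` and `rejectsModification` are hypothetical names, and the three words listed are placeholders rather than the contents of `STOP_WORDS_SET`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for an unmodifiable default stop-word set.
public class StopWordSetSketch {

    // Placeholder words; not the actual contents of STOP_WORDS_SET.
    static final Set<String> DEFAULT_STOP_WORDS =
        Collections.unmodifiableSet(new HashSet<>(Arrays.asList("a", "an", "the")));

    // Returns a new, mutable set holding the base words plus the extras.
    // The base set itself is never modified.
    static Set<String> extend(Set<String> base, String... extra) {
        Set<String> copy = new HashSet<>(base);
        copy.addAll(Arrays.asList(extra));
        return copy;
    }

    // Returns true if the set refuses an attempt to add an element.
    static boolean rejectsModification(Set<String> set) {
        try {
            set.add("probe");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsModification(DEFAULT_STOP_WORDS)); // prints true
        Set<String> custom = extend(DEFAULT_STOP_WORDS, "http", "www");
        System.out.println(custom.size()); // prints 5
    }
}
```

With Lucene itself, the copied words would then be wrapped in a `CharArraySet` and passed to the `UAX29URLEmailAnalyzer(CharArraySet stopWords)` constructor.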
  • Constructor Details

    • UAX29URLEmailAnalyzer

      public UAX29URLEmailAnalyzer(CharArraySet stopWords)
      Builds an analyzer with the given stop words.
      Parameters:
      stopWords - the set of stop words to remove from the token stream
    • UAX29URLEmailAnalyzer

      public UAX29URLEmailAnalyzer()
      Builds an analyzer with the default stop words (STOP_WORDS_SET).
    • UAX29URLEmailAnalyzer

      public UAX29URLEmailAnalyzer(Reader stopwords) throws IOException
      Builds an analyzer with the stop words from the given reader.
      Parameters:
      stopwords - Reader to read stop words from
      Throws:
      IOException
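As a rough picture of what reading stop words from a `Reader` involves, here is a plain-Java sketch. It assumes a one-word-per-line format with surrounding whitespace trimmed, and it skips blank lines; both the format and the blank-line handling are assumptions of this sketch, not a statement of what the constructor does internally. `StopWordReaderSketch` and `loadStopWords` are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical loader: one stop word per line (assumed format).
public class StopWordReaderSketch {

    // Reads every line, trims it, and collects the non-empty results.
    // Closes the reader when done; I/O failures propagate as IOException.
    static Set<String> loadStopWords(Reader reader) throws IOException {
        Set<String> words = new LinkedHashSet<>();
        try (BufferedReader br = new BufferedReader(reader)) {
            String line;
            while ((line = br.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        Reader source = new StringReader("the\n  and \n\nof\n");
        // prints [the, and, of]
        System.out.println(loadStopWords(source));
    }
}
```

The checked `IOException` in the signature mirrors the `throws IOException` clause on the constructor: any failure while reading the source surfaces to the caller.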
  • Method Details