Class EdgeNGramTokenizer

All Implemented Interfaces:
Closeable, AutoCloseable

public class EdgeNGramTokenizer extends NGramTokenizer
Tokenizes the input from an edge into n-grams of given size(s).

This Tokenizer creates n-grams from the beginning edge of an input token.

As of Lucene 4.4, this class supports pre-tokenization and correctly handles supplementary characters.
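
The following is a minimal sketch of what "n-grams from the beginning edge" means, written in plain Java with no Lucene dependency. The class EdgeNGramSketch and the helper edgeNGrams are hypothetical names introduced only for illustration; they are not part of Lucene and do not reproduce its implementation. The sketch assumes the whole input is treated as a single token and measures gram sizes in Unicode code points, so a supplementary character counts as one character.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical stand-in, not Lucene's EdgeNGramTokenizer.
class EdgeNGramSketch {

    // Returns the prefixes of `token` whose length, counted in code points,
    // lies in the range [minGram, maxGram]. Argument validation here is an
    // assumption of this sketch.
    static List<String> edgeNGrams(String token, int minGram, int maxGram) {
        if (minGram < 1 || minGram > maxGram) {
            throw new IllegalArgumentException("require 1 <= minGram <= maxGram");
        }
        List<String> grams = new ArrayList<>();
        int length = token.codePointCount(0, token.length());
        for (int n = minGram; n <= maxGram && n <= length; n++) {
            // Convert a code point count into a char index so that surrogate
            // pairs are never split.
            int end = token.offsetByCodePoints(0, n);
            grams.add(token.substring(0, end));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(edgeNGrams("lucene", 1, 3)); // prints [l, lu, luc]
    }
}
```

In this sketch, a token shorter than minGram yields no grams, and a token shorter than maxGram yields only the prefixes that fit.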

  • Field Details

  • Constructor Details

    • EdgeNGramTokenizer

      public EdgeNGramTokenizer(int minGram, int maxGram)
      Creates an EdgeNGramTokenizer that generates n-grams with sizes in the given range.
      Parameters:
      minGram - the smallest n-gram to generate
      maxGram - the largest n-gram to generate
    • EdgeNGramTokenizer

      public EdgeNGramTokenizer(AttributeFactory factory, int minGram, int maxGram)
      Creates an EdgeNGramTokenizer that generates n-grams with sizes in the given range, using the given AttributeFactory.
      Parameters:
      factory - AttributeFactory to use
      minGram - the smallest n-gram to generate
      maxGram - the largest n-gram to generate