Class EdgeNGramTokenizer

  • All Implemented Interfaces:
    Closeable, AutoCloseable

    public class EdgeNGramTokenizer
    extends NGramTokenizer
    Tokenizes the input from an edge into n-grams of given size(s).

    This Tokenizer creates n-grams from the beginning edge of an input token.

    As of Lucene 4.4, this class supports pre-tokenization and correctly handles supplementary characters.
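
    The edge n-gram behavior described above can be sketched in plain Java, without depending on Lucene. This is an illustration only, not the Lucene implementation: the class name EdgeNGramSketch and the method edgeNGrams are hypothetical, and the sketch assumes the whole input is treated as a single token. Gram sizes are counted in code points, so a supplementary character is not split across its surrogate pair.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Lucene.
final class EdgeNGramSketch {

    /**
     * Returns the prefixes of {@code input} whose length, counted in
     * code points, lies between minGram and maxGram (inclusive).
     * Assumption: the whole input is one token (no pre-tokenization).
     */
    static List<String> edgeNGrams(String input, int minGram, int maxGram) {
        if (minGram < 1) {
            throw new IllegalArgumentException("minGram must be greater than zero");
        }
        if (minGram > maxGram) {
            throw new IllegalArgumentException("minGram must not be greater than maxGram");
        }
        List<String> grams = new ArrayList<>();
        // Count code points, not chars, so surrogate pairs count as one unit.
        int codePoints = input.codePointCount(0, input.length());
        int longest = Math.min(maxGram, codePoints);
        for (int n = minGram; n <= longest; n++) {
            // Convert a length in code points into a char index.
            int end = input.offsetByCodePoints(0, n);
            grams.add(input.substring(0, end));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Prints [l, lu, luc]
        System.out.println(edgeNGrams("lucene", 1, 3));
    }
}
```

    In this sketch, an input shorter than minGram code points yields no n-grams at all.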

    • Constructor Detail

      • EdgeNGramTokenizer

        public EdgeNGramTokenizer(int minGram,
                                  int maxGram)
        Creates an EdgeNGramTokenizer that can generate n-grams with sizes in the given range.
        Parameters:
        minGram - the smallest n-gram to generate
        maxGram - the largest n-gram to generate
      • EdgeNGramTokenizer

        public EdgeNGramTokenizer(AttributeFactory factory,
                                  int minGram,
                                  int maxGram)
        Creates an EdgeNGramTokenizer that can generate n-grams with sizes in the given range.
        Parameters:
        factory - AttributeFactory to use
        minGram - the smallest n-gram to generate
        maxGram - the largest n-gram to generate