org.apache.lucene.analysis.ngram
Class EdgeNGramTokenizer

java.lang.Object
  extended by org.apache.lucene.util.AttributeSource
      extended by org.apache.lucene.analysis.TokenStream
          extended by org.apache.lucene.analysis.Tokenizer
              extended by org.apache.lucene.analysis.ngram.EdgeNGramTokenizer
public final class EdgeNGramTokenizer
extends Tokenizer

Tokenizes the input from an edge into n-grams of the given size(s). This Tokenizer
creates n-grams from either the beginning (front) edge or the ending (back) edge of
an input token. maxGram cannot be larger than 1024 because of an internal limitation.
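A minimal usage sketch, assuming a Lucene 3.x setup with the contrib analyzers jar on
the classpath and CharTermAttribute available (older releases expose the equivalent,
now-deprecated TermAttribute instead):

    import java.io.IOException;
    import java.io.StringReader;

    import org.apache.lucene.analysis.ngram.EdgeNGramTokenizer;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

    public class EdgeNGramDemo {
        public static void main(String[] args) throws IOException {
            // Front-edge n-grams of length 1..3 for "lucene": "l", "lu", "luc".
            EdgeNGramTokenizer tokenizer = new EdgeNGramTokenizer(
                    new StringReader("lucene"),
                    EdgeNGramTokenizer.Side.FRONT,  // chop n-grams off the beginning edge
                    1,                              // minGram
                    3);                             // maxGram

            CharTermAttribute term = tokenizer.addAttribute(CharTermAttribute.class);
            while (tokenizer.incrementToken()) {    // returns false at end of stream
                System.out.println(term.toString());
            }
            tokenizer.end();    // sets the final offset
            tokenizer.close();
        }
    }

With EdgeNGramTokenizer.Side.BACK the same configuration should instead emit the
ending-edge grams "e", "ne", "ene".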
Nested Class Summary
--------------------
static class   EdgeNGramTokenizer.Side
               Specifies which side of the input the n-gram should be generated from.

Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource:
    AttributeSource.AttributeFactory, AttributeSource.State
Field Summary
-------------
static int                       DEFAULT_MAX_GRAM_SIZE
static int                       DEFAULT_MIN_GRAM_SIZE
static EdgeNGramTokenizer.Side   DEFAULT_SIDE

Fields inherited from class org.apache.lucene.analysis.Tokenizer:
    input
Constructor Summary
-------------------
EdgeNGramTokenizer(AttributeSource.AttributeFactory factory, Reader input,
                   EdgeNGramTokenizer.Side side, int minGram, int maxGram)
        Creates an EdgeNGramTokenizer that can generate n-grams in the sizes of the given range.

EdgeNGramTokenizer(AttributeSource.AttributeFactory factory, Reader input,
                   String sideLabel, int minGram, int maxGram)
        Creates an EdgeNGramTokenizer that can generate n-grams in the sizes of the given range.

EdgeNGramTokenizer(AttributeSource source, Reader input,
                   EdgeNGramTokenizer.Side side, int minGram, int maxGram)
        Creates an EdgeNGramTokenizer that can generate n-grams in the sizes of the given range.

EdgeNGramTokenizer(AttributeSource source, Reader input,
                   String sideLabel, int minGram, int maxGram)
        Creates an EdgeNGramTokenizer that can generate n-grams in the sizes of the given range.

EdgeNGramTokenizer(Reader input, EdgeNGramTokenizer.Side side, int minGram, int maxGram)
        Creates an EdgeNGramTokenizer that can generate n-grams in the sizes of the given range.

EdgeNGramTokenizer(Reader input, String sideLabel, int minGram, int maxGram)
        Creates an EdgeNGramTokenizer that can generate n-grams in the sizes of the given range.
Method Summary
--------------
void      end()
          This method is called by the consumer after the last token has been
          consumed, after TokenStream.incrementToken() returned false (using the new
          TokenStream API).

boolean   incrementToken()
          Advances the stream to the next token, returning false at end of stream.

void      reset()
          Resets this stream to the beginning.

void      reset(Reader input)
          Expert: Reset the tokenizer to a new reader.

Methods inherited from class org.apache.lucene.analysis.Tokenizer:
    close, correctOffset

Methods inherited from class org.apache.lucene.util.AttributeSource:
    addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes,
    copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory,
    getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString,
    reflectWith, restoreState, toString

Methods inherited from class java.lang.Object:
    clone, finalize, getClass, notify, notifyAll, wait, wait, wait
Field Detail
------------
public static final EdgeNGramTokenizer.Side DEFAULT_SIDE

public static final int DEFAULT_MAX_GRAM_SIZE

public static final int DEFAULT_MIN_GRAM_SIZE
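These constants can be passed straight to the (Reader, Side, int, int) constructor; a
minimal sketch (the concrete default values are not listed on this page, so none are
assumed here):

    import java.io.StringReader;

    import org.apache.lucene.analysis.ngram.EdgeNGramTokenizer;

    public class DefaultsDemo {
        public static void main(String[] args) {
            // Build a tokenizer from the class defaults rather than hard-coded literals.
            EdgeNGramTokenizer tokenizer = new EdgeNGramTokenizer(
                    new StringReader("lucene"),
                    EdgeNGramTokenizer.DEFAULT_SIDE,
                    EdgeNGramTokenizer.DEFAULT_MIN_GRAM_SIZE,
                    EdgeNGramTokenizer.DEFAULT_MAX_GRAM_SIZE);
        }
    }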
Constructor Detail
------------------
public EdgeNGramTokenizer(Reader input, EdgeNGramTokenizer.Side side,
                          int minGram, int maxGram)

    Parameters:
        input   - Reader holding the input to be tokenized
        side    - the EdgeNGramTokenizer.Side from which to chop off an n-gram
        minGram - the smallest n-gram to generate
        maxGram - the largest n-gram to generate

public EdgeNGramTokenizer(AttributeSource source, Reader input,
                          EdgeNGramTokenizer.Side side, int minGram, int maxGram)

    Parameters:
        source  - AttributeSource to use
        input   - Reader holding the input to be tokenized
        side    - the EdgeNGramTokenizer.Side from which to chop off an n-gram
        minGram - the smallest n-gram to generate
        maxGram - the largest n-gram to generate

public EdgeNGramTokenizer(AttributeSource.AttributeFactory factory, Reader input,
                          EdgeNGramTokenizer.Side side, int minGram, int maxGram)

    Parameters:
        factory - AttributeSource.AttributeFactory to use
        input   - Reader holding the input to be tokenized
        side    - the EdgeNGramTokenizer.Side from which to chop off an n-gram
        minGram - the smallest n-gram to generate
        maxGram - the largest n-gram to generate

public EdgeNGramTokenizer(Reader input, String sideLabel, int minGram, int maxGram)

    Parameters:
        input     - Reader holding the input to be tokenized
        sideLabel - the name of the EdgeNGramTokenizer.Side from which to chop off an n-gram
        minGram   - the smallest n-gram to generate
        maxGram   - the largest n-gram to generate

public EdgeNGramTokenizer(AttributeSource source, Reader input, String sideLabel,
                          int minGram, int maxGram)

    Parameters:
        source    - AttributeSource to use
        input     - Reader holding the input to be tokenized
        sideLabel - the name of the EdgeNGramTokenizer.Side from which to chop off an n-gram
        minGram   - the smallest n-gram to generate
        maxGram   - the largest n-gram to generate

public EdgeNGramTokenizer(AttributeSource.AttributeFactory factory, Reader input,
                          String sideLabel, int minGram, int maxGram)

    Parameters:
        factory   - AttributeSource.AttributeFactory to use
        input     - Reader holding the input to be tokenized
        sideLabel - the name of the EdgeNGramTokenizer.Side from which to chop off an n-gram
        minGram   - the smallest n-gram to generate
        maxGram   - the largest n-gram to generate
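The Side-typed and String-labelled constructors are interchangeable; a hedged sketch,
assuming the accepted labels mirror the Side constants in lower case ("front" and
"back"; verify against your Lucene version):

    import java.io.StringReader;

    import org.apache.lucene.analysis.ngram.EdgeNGramTokenizer;

    public class SideLabelDemo {
        public static void main(String[] args) {
            // The same configuration expressed two ways: by enum constant and by label.
            EdgeNGramTokenizer byEnum = new EdgeNGramTokenizer(
                    new StringReader("lucene"), EdgeNGramTokenizer.Side.BACK, 2, 4);
            EdgeNGramTokenizer byLabel = new EdgeNGramTokenizer(
                    new StringReader("lucene"), "back", 2, 4);  // "back" is assumed to name Side.BACK
        }
    }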
Method Detail
-------------
public final boolean incrementToken() throws IOException

    Specified by:
        incrementToken in class TokenStream
    Throws:
        IOException

public final void end()

    This method is called by the consumer after the last token has been consumed,
    after TokenStream.incrementToken() returned false (using the new TokenStream API).
    Streams implementing the old API should upgrade to use this feature.

    This method can be used to perform any end-of-stream operations, such as setting
    the final offset of a stream. The final offset of a stream might differ from the
    offset of the last token, e.g. when one or more whitespace characters followed the
    last token and a WhitespaceTokenizer was used.

    Overrides:
        end in class TokenStream

public void reset(Reader input) throws IOException

    Expert: Reset the tokenizer to a new reader.

    Overrides:
        reset in class Tokenizer
    Throws:
        IOException

public void reset() throws IOException

    Resets this stream to the beginning. TokenStream.reset() is not needed for the
    standard indexing process. However, if the tokens of a TokenStream are intended to
    be consumed more than once, it is necessary to implement TokenStream.reset(). Note
    that if your TokenStream caches tokens and feeds them back again after a reset, it
    is imperative that you clone the tokens when you store them away (on the first
    pass) as well as when you return them (on future passes after TokenStream.reset()).

    Overrides:
        reset in class TokenStream
    Throws:
        IOException
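To illustrate reset(Reader), a minimal sketch of reusing one tokenizer instance across
two inputs (same Lucene 3.x CharTermAttribute assumption as in the earlier sketch):

    import java.io.IOException;
    import java.io.StringReader;

    import org.apache.lucene.analysis.ngram.EdgeNGramTokenizer;
    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

    public class ReuseDemo {
        public static void main(String[] args) throws IOException {
            EdgeNGramTokenizer tokenizer = new EdgeNGramTokenizer(
                    new StringReader("lucene"), EdgeNGramTokenizer.Side.FRONT, 1, 3);
            CharTermAttribute term = tokenizer.addAttribute(CharTermAttribute.class);

            while (tokenizer.incrementToken()) {   // first pass: grams of "lucene"
                System.out.println(term.toString());
            }
            tokenizer.end();

            // Point the same instance at a new Reader instead of constructing another tokenizer.
            tokenizer.reset(new StringReader("apache"));
            while (tokenizer.incrementToken()) {   // second pass: grams of "apache"
                System.out.println(term.toString());
            }
            tokenizer.end();
            tokenizer.close();
        }
    }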