java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.Tokenizer
org.apache.lucene.analysis.ngram.EdgeNGramTokenizer
public class EdgeNGramTokenizer
Tokenizes the input from an edge into n-grams of the given size(s).
This Tokenizer creates n-grams from the beginning edge or the ending edge of an input token.
maxGram cannot be larger than 1024 because of an internal limitation.
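To make the front/back distinction concrete, here is a minimal standalone sketch of edge n-gram generation. It does not use Lucene and is not the library's implementation; the class name `EdgeNGramSketch`, the method `edgeNGrams`, and the `fromFront` flag are hypothetical names chosen for illustration, and the handling of inputs shorter than `maxGram` is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Standalone illustration of edge n-grams; it does not use or reproduce the Lucene class. */
final class EdgeNGramSketch {

    /**
     * Returns the n-grams of sizes minGram..maxGram taken from one edge of the input.
     * fromFront = true anchors every gram at the first character,
     * fromFront = false anchors every gram at the last character.
     */
    static List<String> edgeNGrams(String input, boolean fromFront, int minGram, int maxGram) {
        if (minGram < 1 || minGram > maxGram) {
            throw new IllegalArgumentException("need 1 <= minGram <= maxGram");
        }
        List<String> grams = new ArrayList<>();
        // Assumption: grams longer than the input are simply not produced.
        int largest = Math.min(maxGram, input.length());
        for (int size = minGram; size <= largest; size++) {
            int start = fromFront ? 0 : input.length() - size;
            grams.add(input.substring(start, start + size));
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(edgeNGrams("lucene", true, 1, 3));  // [l, lu, luc]
        System.out.println(edgeNGrams("lucene", false, 1, 3)); // [e, ne, ene]
    }
}
```

Front grams are the prefixes typically used for prefix or autocomplete-style matching, while back grams are the corresponding suffixes.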
Nested Class Summary | |
---|---|
static class | EdgeNGramTokenizer.Side: Specifies which side of the input the n-gram should be generated from |
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource |
---|
org.apache.lucene.util.AttributeSource.AttributeFactory, org.apache.lucene.util.AttributeSource.State |
Field Summary | |
---|---|
static int | DEFAULT_MAX_GRAM_SIZE |
static int | DEFAULT_MIN_GRAM_SIZE |
static EdgeNGramTokenizer.Side | DEFAULT_SIDE |
Fields inherited from class org.apache.lucene.analysis.Tokenizer |
---|
input |
Constructor Summary | |
---|---|
EdgeNGramTokenizer(org.apache.lucene.util.AttributeSource.AttributeFactory factory, Reader input, EdgeNGramTokenizer.Side side, int minGram, int maxGram) | Creates an EdgeNGramTokenizer that can generate n-grams with sizes in the given range |
EdgeNGramTokenizer(org.apache.lucene.util.AttributeSource.AttributeFactory factory, Reader input, String sideLabel, int minGram, int maxGram) | Creates an EdgeNGramTokenizer that can generate n-grams with sizes in the given range |
EdgeNGramTokenizer(org.apache.lucene.util.AttributeSource source, Reader input, EdgeNGramTokenizer.Side side, int minGram, int maxGram) | Creates an EdgeNGramTokenizer that can generate n-grams with sizes in the given range |
EdgeNGramTokenizer(org.apache.lucene.util.AttributeSource source, Reader input, String sideLabel, int minGram, int maxGram) | Creates an EdgeNGramTokenizer that can generate n-grams with sizes in the given range |
EdgeNGramTokenizer(Reader input, EdgeNGramTokenizer.Side side, int minGram, int maxGram) | Creates an EdgeNGramTokenizer that can generate n-grams with sizes in the given range |
EdgeNGramTokenizer(Reader input, String sideLabel, int minGram, int maxGram) | Creates an EdgeNGramTokenizer that can generate n-grams with sizes in the given range |
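
The constructors come in pairs: one takes an `EdgeNGramTokenizer.Side`, the other a `String sideLabel` naming that side. The sketch below shows one way such a pair can be wired together, with the label-based constructor resolving the label and delegating to the side-based one. Everything here is a hypothetical stand-in rather than Lucene code: `SideSketch`, `fromLabel`, and `EdgeNGramArgs` are invented names, and the label values `"front"` and `"back"`, as well as the argument checks, are assumptions that should be verified against the Lucene source.

```java
/** Hypothetical stand-in for EdgeNGramTokenizer.Side; the label values are assumptions. */
enum SideSketch {
    FRONT("front"),
    BACK("back");

    private final String label;

    SideSketch(String label) {
        this.label = label;
    }

    String getLabel() {
        return label;
    }

    /** Returns the side whose label equals sideLabel, or null when none matches. */
    static SideSketch fromLabel(String sideLabel) {
        for (SideSketch side : values()) {
            if (side.label.equals(sideLabel)) {
                return side;
            }
        }
        return null;
    }
}

/** Hypothetical holder for validated tokenizer arguments. */
final class EdgeNGramArgs {
    final SideSketch side;
    final int minGram;
    final int maxGram;

    /** Label-based overload: resolves the label, then delegates to the side-based overload. */
    EdgeNGramArgs(String sideLabel, int minGram, int maxGram) {
        this(SideSketch.fromLabel(sideLabel), minGram, maxGram);
    }

    /** Side-based overload: performs all argument checks (assumed rules, not Lucene's). */
    EdgeNGramArgs(SideSketch side, int minGram, int maxGram) {
        if (side == null) {
            throw new IllegalArgumentException("side must be front or back");
        }
        if (minGram < 1) {
            throw new IllegalArgumentException("minGram must be at least 1");
        }
        if (minGram > maxGram) {
            throw new IllegalArgumentException("minGram must not exceed maxGram");
        }
        this.side = side;
        this.minGram = minGram;
        this.maxGram = maxGram;
    }
}
```

Keeping the checks in the side-based constructor means an unknown label and a null side fail in the same place with the same exception.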
Method Summary | |
---|---|
void | end() |
boolean | incrementToken(): Advances to the next token in the stream; returns false at the end of the stream. |
org.apache.lucene.analysis.Token | next(): Deprecated. Will be removed in Lucene 3.0. This method is final, as it should not be overridden. Delegates to the backwards compatibility layer. |
org.apache.lucene.analysis.Token | next(org.apache.lucene.analysis.Token reusableToken): Deprecated. Will be removed in Lucene 3.0. This method is final, as it should not be overridden. Delegates to the backwards compatibility layer. |
void | reset() |
void | reset(Reader input) |
Methods inherited from class org.apache.lucene.analysis.Tokenizer |
---|
close, correctOffset |
Methods inherited from class org.apache.lucene.analysis.TokenStream |
---|
getOnlyUseNewAPI, setOnlyUseNewAPI |
Methods inherited from class org.apache.lucene.util.AttributeSource |
---|
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, restoreState, toString |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
public static final EdgeNGramTokenizer.Side DEFAULT_SIDE
public static final int DEFAULT_MAX_GRAM_SIZE
public static final int DEFAULT_MIN_GRAM_SIZE
Constructor Detail |
---|
public EdgeNGramTokenizer(Reader input, EdgeNGramTokenizer.Side side, int minGram, int maxGram)
input - Reader holding the input to be tokenized
side - the EdgeNGramTokenizer.Side from which to chop off an n-gram
minGram - the smallest n-gram to generate
maxGram - the largest n-gram to generate

public EdgeNGramTokenizer(org.apache.lucene.util.AttributeSource source, Reader input, EdgeNGramTokenizer.Side side, int minGram, int maxGram)
source - AttributeSource to use
input - Reader holding the input to be tokenized
side - the EdgeNGramTokenizer.Side from which to chop off an n-gram
minGram - the smallest n-gram to generate
maxGram - the largest n-gram to generate

public EdgeNGramTokenizer(org.apache.lucene.util.AttributeSource.AttributeFactory factory, Reader input, EdgeNGramTokenizer.Side side, int minGram, int maxGram)
factory - AttributeSource.AttributeFactory to use
input - Reader holding the input to be tokenized
side - the EdgeNGramTokenizer.Side from which to chop off an n-gram
minGram - the smallest n-gram to generate
maxGram - the largest n-gram to generate

public EdgeNGramTokenizer(Reader input, String sideLabel, int minGram, int maxGram)
input - Reader holding the input to be tokenized
sideLabel - the name of the EdgeNGramTokenizer.Side from which to chop off an n-gram
minGram - the smallest n-gram to generate
maxGram - the largest n-gram to generate

public EdgeNGramTokenizer(org.apache.lucene.util.AttributeSource source, Reader input, String sideLabel, int minGram, int maxGram)
source - AttributeSource to use
input - Reader holding the input to be tokenized
sideLabel - the name of the EdgeNGramTokenizer.Side from which to chop off an n-gram
minGram - the smallest n-gram to generate
maxGram - the largest n-gram to generate

public EdgeNGramTokenizer(org.apache.lucene.util.AttributeSource.AttributeFactory factory, Reader input, String sideLabel, int minGram, int maxGram)
factory - AttributeSource.AttributeFactory to use
input - Reader holding the input to be tokenized
sideLabel - the name of the EdgeNGramTokenizer.Side from which to chop off an n-gram
minGram - the smallest n-gram to generate
maxGram - the largest n-gram to generate

Method Detail |
---|
public final boolean incrementToken() throws IOException
Overrides: incrementToken in class org.apache.lucene.analysis.TokenStream
Throws: IOException

public final void end()
Overrides: end in class org.apache.lucene.analysis.TokenStream

public final org.apache.lucene.analysis.Token next(org.apache.lucene.analysis.Token reusableToken) throws IOException
Overrides: next in class org.apache.lucene.analysis.TokenStream
Throws: IOException

public final org.apache.lucene.analysis.Token next() throws IOException
Overrides: next in class org.apache.lucene.analysis.TokenStream
Throws: IOException

public void reset(Reader input) throws IOException
Overrides: reset in class org.apache.lucene.analysis.Tokenizer
Throws: IOException

public void reset() throws IOException
Overrides: reset in class org.apache.lucene.analysis.TokenStream
Throws: IOException
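
Because `incrementToken()` returns a `boolean` rather than a token, consumers loop until it returns false and read the current token's state between calls. The sketch below models only that contract with a hypothetical stand-in class; `GramStreamSketch`, `term()`, and `drain` are invented for illustration and are not part of Lucene, where token text is exposed through attributes instead.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a token stream; only the boolean incrementToken() contract is modeled. */
final class GramStreamSketch {
    private final String[] grams;
    private int next = 0;
    private String term; // current token text, valid after incrementToken() returned true

    GramStreamSketch(String... grams) {
        this.grams = grams;
    }

    /** Advances to the next token; returns false once the stream is exhausted. */
    boolean incrementToken() {
        if (next >= grams.length) {
            return false;
        }
        term = grams[next++];
        return true;
    }

    String term() {
        return term;
    }

    /** Rewinds the stream so it can be consumed again. */
    void reset() {
        next = 0;
        term = null;
    }

    /** Drains a stream with the consumer loop and returns the terms seen. */
    static List<String> drain(GramStreamSketch stream) {
        List<String> seen = new ArrayList<>();
        stream.reset();
        while (stream.incrementToken()) {
            seen.add(stream.term());
        }
        return seen;
    }
}
```

The same loop shape replaces the deprecated `next()` style, in which the end of the stream was signalled by a null token.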