java.lang.Object
  org.apache.lucene.util.AttributeSource
    org.apache.lucene.analysis.TokenStream
      org.apache.lucene.analysis.Tokenizer
public abstract class Tokenizer extends TokenStream
A Tokenizer is a TokenStream whose input is a Reader.
This is an abstract class; subclasses must override TokenStream.incrementToken().
NOTE: Subclasses overriding TokenStream.incrementToken() must call AttributeSource.clearAttributes() before setting attributes.
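The reason for the clear-before-set rule can be sketched with a small self-contained model. The class below is a hypothetical stand-in, not the Lucene API: its `term` and `type` fields play the role of attributes that are reused from one token to the next, and its `clearAttributes()` and `incrementToken()` methods only mimic the roles of the methods named above. It assumes a token type that is set only for numeric tokens, so a stale value can leak into the following token when the state is not cleared first.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for illustration only; these are NOT the Lucene classes.
final class ClearAttributesSketch {
    // Reusable per-stream state, playing the role of attributes.
    private String term = "";
    private String type = "word"; // default restored by clearAttributes()

    private final String[] words;
    private final boolean clearFirst;
    private int next = 0;

    ClearAttributesSketch(String text, boolean clearFirst) {
        this.words = text.trim().split("\\s+");
        this.clearFirst = clearFirst;
    }

    // Plays the role of AttributeSource.clearAttributes(): reset to defaults.
    private void clearAttributes() {
        term = "";
        type = "word";
    }

    // Plays the role of TokenStream.incrementToken().
    boolean incrementToken() {
        if (next >= words.length) {
            return false; // no more tokens
        }
        if (clearFirst) {
            clearAttributes(); // wipe values left over from the previous token
        }
        term = words[next++];
        // Only numeric tokens set the type; other tokens rely on the default.
        if (term.matches("\\d+")) {
            type = "num";
        }
        return true;
    }

    // Collects "term/type" pairs for every token in the text.
    static List<String> run(String text, boolean clearFirst) {
        ClearAttributesSketch ts = new ClearAttributesSketch(text, clearFirst);
        List<String> out = new ArrayList<>();
        while (ts.incrementToken()) {
            out.add(ts.term + "/" + ts.type);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run("42 apples", true));  // prints [42/num, apples/word]
        System.out.println(run("42 apples", false)); // prints [42/num, apples/num]
    }
}
```

Without the clearing step, the second token inherits the `num` type set for the first one, which is the kind of stale state the NOTE guards against.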
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource |
---|
AttributeSource.AttributeFactory, AttributeSource.State |
Field Summary | |
---|---|
protected Reader |
input
The text source for this Tokenizer. |
Constructor Summary | |
---|---|
protected |
Tokenizer()
Construct a tokenizer with null input. |
protected |
Tokenizer(AttributeSource.AttributeFactory factory)
Construct a tokenizer with null input using the given AttributeFactory. |
protected |
Tokenizer(AttributeSource.AttributeFactory factory,
Reader input)
Construct a token stream processing the given input using the given AttributeFactory. |
protected |
Tokenizer(AttributeSource source)
Construct a tokenizer with null input using the given AttributeSource. |
protected |
Tokenizer(AttributeSource source,
Reader input)
Construct a token stream processing the given input using the given AttributeSource. |
protected |
Tokenizer(Reader input)
Construct a token stream processing the given input. |
Method Summary | |
---|---|
void |
close()
By default, closes the input Reader. |
protected int |
correctOffset(int currentOff)
Return the corrected offset. |
void |
reset(Reader input)
Expert: Reset the tokenizer to a new reader. |
Methods inherited from class org.apache.lucene.analysis.TokenStream |
---|
end, incrementToken, reset |
Methods inherited from class org.apache.lucene.util.AttributeSource |
---|
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
protected Reader input
The text source for this Tokenizer.
Constructor Detail |
---|
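protected Tokenizer()
Construct a tokenizer with null input.
protected Tokenizer(Reader input)
Construct a token stream processing the given input.
protected Tokenizer(AttributeSource.AttributeFactory factory)
Construct a tokenizer with null input using the given AttributeFactory.
protected Tokenizer(AttributeSource.AttributeFactory factory, Reader input)
Construct a token stream processing the given input using the given AttributeFactory.
protected Tokenizer(AttributeSource source)
Construct a tokenizer with null input using the given AttributeSource.
protected Tokenizer(AttributeSource source, Reader input)
Construct a token stream processing the given input using the given AttributeSource.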
Method Detail |
---|
public void close() throws IOException
By default, closes the input Reader.
Specified by: close in interface Closeable
Overrides: close in class TokenStream
Throws: IOException
protected final int correctOffset(int currentOff)
Return the corrected offset. If input is a CharStream subclass, this method calls CharStream.correctOffset(int); otherwise it returns currentOff.
Parameters: currentOff - offset as seen in the output
See Also: CharStream.correctOffset(int)
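The delegation described for correctOffset can be sketched without Lucene on the classpath. In the sketch below, `OffsetMappingReader` and `SkippingReader` are hypothetical stand-ins for a CharStream subclass (they are not Lucene types), and the static `correctOffset` helper only mirrors the documented behavior: delegate when the reader can map offsets, otherwise return the offset unchanged.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Illustrative model only; the nested types are assumptions, not Lucene classes.
final class CorrectOffsetSketch {
    // Hypothetical stand-in for CharStream: a Reader that can map an offset
    // in its output back to an offset in the original text.
    abstract static class OffsetMappingReader extends Reader {
        abstract int correctOffset(int currentOff);
    }

    // Hypothetical reader that drops the first `skipped` characters of the
    // original text, so every output offset is shifted by that amount.
    static final class SkippingReader extends OffsetMappingReader {
        private final StringReader in;
        private final int skipped;

        SkippingReader(String original, int skipped) {
            this.in = new StringReader(original.substring(skipped));
            this.skipped = skipped;
        }

        @Override
        int correctOffset(int currentOff) {
            return currentOff + skipped;
        }

        @Override
        public int read(char[] buf, int off, int len) throws IOException {
            return in.read(buf, off, len);
        }

        @Override
        public void close() {
            in.close();
        }
    }

    // Mirrors the documented rule: delegate if the input can correct offsets,
    // else return currentOff as is.
    static int correctOffset(Reader input, int currentOff) {
        return (input instanceof OffsetMappingReader)
                ? ((OffsetMappingReader) input).correctOffset(currentOff)
                : currentOff;
    }

    public static void main(String[] args) {
        System.out.println(correctOffset(new StringReader("abc"), 2));          // prints 2
        System.out.println(correctOffset(new SkippingReader("<b>abc", 3), 2)); // prints 5
    }
}
```

A plain reader leaves the offset untouched, while the offset-mapping reader shifts it back into the coordinates of the original text.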
public void reset(Reader input) throws IOException
Expert: Reset the tokenizer to a new reader.
Throws: IOException