java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.Tokenizer
public abstract class Tokenizer
A Tokenizer is a TokenStream whose input is a Reader.
This is an abstract class; subclasses must override TokenStream.incrementToken().
NOTE: Subclasses overriding TokenStream.incrementToken() must call AttributeSource.clearAttributes() before setting attributes.
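The contract in the NOTE above can be illustrated without Lucene on the classpath. The following is a minimal sketch, not Lucene code: `SketchTokenizer`, its `term`, `startOffset`, and `endOffset` fields, and `WhitespaceSketchTokenizer` are hypothetical stand-ins for Tokenizer, its attributes, and a concrete subclass. It only shows the ordering the NOTE asks for: per-token state is cleared first, then set.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Tokenizer; NOT the Lucene class.
abstract class SketchTokenizer {
    protected Reader input; // the text source, playing the role of Tokenizer.input

    // Plain fields standing in for the attributes a real subclass would set.
    protected final StringBuilder term = new StringBuilder();
    protected int startOffset;
    protected int endOffset;

    protected SketchTokenizer(Reader input) {
        this.input = input;
    }

    // Plays the role of AttributeSource.clearAttributes(): wipe per-token state.
    protected final void clearAttributes() {
        term.setLength(0);
        startOffset = 0;
        endOffset = 0;
    }

    // Advance to the next token; return false once the input is exhausted.
    public abstract boolean incrementToken() throws IOException;

    // By default, closes the input reader.
    public void close() throws IOException {
        input.close();
    }
}

// Hypothetical subclass that splits on whitespace.
class WhitespaceSketchTokenizer extends SketchTokenizer {
    private int offset = 0; // characters consumed so far

    WhitespaceSketchTokenizer(Reader input) {
        super(input);
    }

    @Override
    public boolean incrementToken() throws IOException {
        clearAttributes(); // first, so nothing leaks over from the previous token
        int c;
        // Skip leading whitespace.
        while ((c = input.read()) != -1 && Character.isWhitespace(c)) {
            offset++;
        }
        if (c == -1) {
            return false;
        }
        startOffset = offset;
        // Accumulate characters until whitespace or end of input.
        do {
            term.append((char) c);
            offset++;
        } while ((c = input.read()) != -1 && !Character.isWhitespace(c));
        if (c != -1) {
            offset++; // account for the delimiter that was just consumed
        }
        endOffset = startOffset + term.length();
        return true;
    }

    // Convenience for the sketch: returns entries of the form term[start,end].
    static List<String> tokenize(String text) {
        List<String> out = new ArrayList<>();
        WhitespaceSketchTokenizer t = new WhitespaceSketchTokenizer(new StringReader(text));
        try {
            while (t.incrementToken()) {
                out.add(t.term + "[" + t.startOffset + "," + t.endOffset + "]");
            }
            t.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

With this sketch, `WhitespaceSketchTokenizer.tokenize("ab  cd")` returns the two entries `ab[0,2]` and `cd[4,6]`. If the `clearAttributes()` call were dropped, the second term would be appended to the first, which is the kind of leak the NOTE guards against.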
| Nested Class Summary |
|---|
| Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource |
|---|
| AttributeSource.AttributeFactory, AttributeSource.State |
| Field Summary | |
|---|---|
| protected Reader | input - The text source for this Tokenizer. |
| Constructor Summary | |
|---|---|
| protected | Tokenizer() - Construct a tokenizer with null input. |
| protected | Tokenizer(AttributeSource.AttributeFactory factory) - Construct a tokenizer with null input using the given AttributeFactory. |
| protected | Tokenizer(AttributeSource.AttributeFactory factory, Reader input) - Construct a token stream processing the given input using the given AttributeFactory. |
| protected | Tokenizer(AttributeSource source) - Construct a tokenizer with null input using the given AttributeSource. |
| protected | Tokenizer(AttributeSource source, Reader input) - Construct a token stream processing the given input using the given AttributeSource. |
| protected | Tokenizer(Reader input) - Construct a token stream processing the given input. |
| Method Summary | |
|---|---|
| void | close() - By default, closes the input Reader. |
| protected int | correctOffset(int currentOff) - Return the corrected offset. |
| void | reset(Reader input) - Expert: Reset the tokenizer to a new reader. |
| Methods inherited from class org.apache.lucene.analysis.TokenStream |
|---|
| end, incrementToken, reset |
| Methods inherited from class org.apache.lucene.util.AttributeSource |
|---|
| addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, restoreState, toString |
| Methods inherited from class java.lang.Object |
|---|
| clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
| Field Detail |
|---|
protected Reader input
| Constructor Detail |
|---|
protected Tokenizer()
protected Tokenizer(Reader input)
protected Tokenizer(AttributeSource.AttributeFactory factory)
protected Tokenizer(AttributeSource.AttributeFactory factory,
Reader input)
protected Tokenizer(AttributeSource source)
protected Tokenizer(AttributeSource source,
Reader input)
| Method Detail |
|---|
public void close()
throws IOException
By default, closes the input Reader.
Specified by: close in interface Closeable
Overrides: close in class TokenStream
Throws: IOException
protected final int correctOffset(int currentOff)
Return the corrected offset. If input is a CharStream subclass, this method calls CharStream.correctOffset(int); otherwise it returns currentOff.
Parameters: currentOff - offset as seen in the output
See Also: CharStream.correctOffset(int)
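The dispatch described for correctOffset can be sketched as follows. This is an illustration under stated assumptions rather than Lucene's source: `OffsetCorrecting` is a hypothetical stand-in for CharStream, `StripPrefixReader` is an invented reader that hides a fixed-length prefix of its text, and `OffsetSketch.correctOffset` takes the reader as a parameter instead of reading a protected field.

```java
import java.io.FilterReader;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for CharStream: a reader that can map an offset in
// its output back to an offset in the original text.
interface OffsetCorrecting {
    int correctOffset(int currentOff);
}

// Invented example reader: exposes the text with its first prefixLength
// characters removed, so output offsets are shifted by that amount.
class StripPrefixReader extends FilterReader implements OffsetCorrecting {
    private final int prefixLength;

    StripPrefixReader(String text, int prefixLength) {
        super(new StringReader(text.substring(prefixLength)));
        this.prefixLength = prefixLength;
    }

    @Override
    public int correctOffset(int currentOff) {
        // Shift the output offset back into original-text coordinates.
        return currentOff + prefixLength;
    }
}

class OffsetSketch {
    // Same shape as the documented behaviour: delegate when the input knows
    // how to correct offsets, otherwise return the offset unchanged.
    static int correctOffset(Reader input, int currentOff) {
        if (input instanceof OffsetCorrecting) {
            return ((OffsetCorrecting) input).correctOffset(currentOff);
        }
        return currentOff;
    }
}
```

For a plain `StringReader` the offset passes through unchanged. For `new StripPrefixReader("xx:abc", 3)`, output offset 0 maps back to offset 3 in the original text.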
public void reset(Reader input)
throws IOException
Expert: Reset the tokenizer to a new reader.
Throws: IOException
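Resetting to a new reader lets one tokenizer instance be reused across several inputs instead of constructing a new one each time. The sketch below illustrates that idea only: `ReusableSketchTokenizer`, `countTokens`, and the `tokensSeen` field are hypothetical and not part of Lucene. Clearing per-document state inside `reset` is an assumption of this sketch, not something this page specifies for Tokenizer.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

// Hypothetical stand-in; NOT the Lucene Tokenizer.
class ReusableSketchTokenizer {
    private Reader input;   // current text source
    private int tokensSeen; // per-document state

    ReusableSketchTokenizer(Reader input) {
        this.input = input;
    }

    // Point the same instance at a new reader.
    // Assumption of this sketch: per-document state is dropped here as well.
    void reset(Reader newInput) {
        this.input = newInput;
        this.tokensSeen = 0;
    }

    // Count whitespace-separated tokens remaining in the current input.
    int countTokens() {
        try {
            boolean inToken = false;
            int c;
            while ((c = input.read()) != -1) {
                boolean ws = Character.isWhitespace(c);
                if (!ws && !inToken) {
                    tokensSeen++; // first character of a new token
                }
                inToken = !ws;
            }
            return tokensSeen;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would construct the instance once with a reader over the first document, call `countTokens()`, then call `reset(...)` with a reader over the next document and count again. Each count reflects only the document currently attached.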