java.lang.Object
  org.apache.lucene.util.AttributeSource
    org.apache.lucene.analysis.TokenStream
      org.apache.lucene.analysis.Tokenizer
        org.apache.lucene.analysis.cn.ChineseTokenizer
Deprecated. Use StandardTokenizer instead, which has the same functionality. This filter will be removed in Lucene 5.0.
@Deprecated public final class ChineseTokenizer
Tokenizes Chinese text into individual Chinese characters.
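ChineseTokenizer and CJKTokenizer differ in their token parsing logic.
For example, if the Chinese text "C1C2C3C4" is to be indexed:
The tokens returned from ChineseTokenizer are C1, C2, C3, C4.
The tokens returned from CJKTokenizer are C1C2, C2C3, C3C4.
Therefore the index created by CJKTokenizer is much larger.
The trade-off shows up at search time. For queries such as C1, C1C2, C1C3, C4C2, C1C2C3 ... the ChineseTokenizer index can match every one, because each character is indexed on its own. The CJKTokenizer index contains only the two-character tokens, so queries that do not correspond to adjacent pairs in the text (for example C1, C1C3, or C4C2) will not match.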
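The tokenization difference described above can be illustrated with a small plain-Java sketch. It does not call Lucene: `TokenizerComparison`, `unigrams`, `bigrams`, and `matches` are hypothetical names introduced here, each ASCII letter stands in for one Chinese character, and matching is simplified to "every query token appears among the indexed tokens", ignoring positions. The handling of a lone character in `bigrams` is likewise an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; it does not use the Lucene classes.
final class TokenizerComparison {

    // One token per character, as described for ChineseTokenizer.
    static List<String> unigrams(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            tokens.add(text.substring(i, i + 1));
        }
        return tokens;
    }

    // Overlapping two-character tokens, as described for CJKTokenizer.
    // Assumption: a lone character is emitted as a single one-character token.
    static List<String> bigrams(String text) {
        List<String> tokens = new ArrayList<>();
        if (text.length() == 1) {
            tokens.add(text);
            return tokens;
        }
        for (int i = 0; i + 1 < text.length(); i++) {
            tokens.add(text.substring(i, i + 2));
        }
        return tokens;
    }

    // Simplified matching: every query token must appear among the
    // indexed tokens; token positions are ignored.
    static boolean matches(List<String> indexedTokens, List<String> queryTokens) {
        return !queryTokens.isEmpty() && indexedTokens.containsAll(queryTokens);
    }

    public static void main(String[] args) {
        String text = "ABCD"; // stands in for C1C2C3C4
        System.out.println("unigram index: " + unigrams(text));
        System.out.println("bigram index:  " + bigrams(text));
        // Queries standing in for C1, C1C2, C1C3, C4C2, C1C2C3.
        for (String query : new String[] {"A", "AB", "AC", "DB", "ABC"}) {
            System.out.println(query
                + " unigram=" + matches(unigrams(text), unigrams(query))
                + " bigram=" + matches(bigrams(text), bigrams(query)));
        }
    }
}
```

Under these assumptions, the queries standing in for C1, C1C3, and C4C2 match only the per-character index, while those standing in for C1C2 and C1C2C3 match both.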
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource |
---|
org.apache.lucene.util.AttributeSource.AttributeFactory, org.apache.lucene.util.AttributeSource.State |
Field Summary |
---|
Fields inherited from class org.apache.lucene.analysis.Tokenizer |
---|
input |
Constructor Summary | |
---|---|
ChineseTokenizer(org.apache.lucene.util.AttributeSource.AttributeFactory factory, Reader in) | Deprecated. |
ChineseTokenizer(org.apache.lucene.util.AttributeSource source, Reader in) | Deprecated. |
ChineseTokenizer(Reader in) | Deprecated. |
Method Summary | |
---|---|
void | end() Deprecated. |
boolean | incrementToken() Deprecated. |
void | reset() Deprecated. |
void | reset(Reader input) Deprecated. |
Methods inherited from class org.apache.lucene.analysis.Tokenizer |
---|
close, correctOffset |
Methods inherited from class org.apache.lucene.util.AttributeSource |
---|
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public ChineseTokenizer(Reader in)
public ChineseTokenizer(org.apache.lucene.util.AttributeSource source, Reader in)
public ChineseTokenizer(org.apache.lucene.util.AttributeSource.AttributeFactory factory, Reader in)
Method Detail |
---|
public boolean incrementToken() throws IOException
Specified by: incrementToken in class org.apache.lucene.analysis.TokenStream
Throws: IOException
public final void end()
Overrides: end in class org.apache.lucene.analysis.TokenStream
public void reset() throws IOException
Overrides: reset in class org.apache.lucene.analysis.TokenStream
Throws: IOException
public void reset(Reader input) throws IOException
Overrides: reset in class org.apache.lucene.analysis.Tokenizer
Throws: IOException