org.apache.lucene.analysis.cn
Class ChineseTokenizer
java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.Tokenizer
org.apache.lucene.analysis.cn.ChineseTokenizer
- All Implemented Interfaces:
- Closeable
Deprecated. (3.1) Use StandardTokenizer
instead, which has the same functionality.
This tokenizer will be removed in Lucene 5.0.
@Deprecated
public final class ChineseTokenizer
- extends Tokenizer
Tokenizes Chinese text into individual Chinese characters.
ChineseTokenizer and CJKTokenizer differ in how they
split text into tokens.
For example, if the Chinese text
"C1C2C3C4" is to be indexed:
- The tokens returned from ChineseTokenizer are C1, C2, C3, C4.
- The tokens returned from CJKTokenizer are C1C2, C2C3, C3C4.
Therefore the index created by CJKTokenizer is much larger.
However, searches for C1, C1C2, C1C3, C4C2, C1C2C3 ... work
with ChineseTokenizer but do not work with CJKTokenizer.
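The two token shapes described above can be illustrated with a minimal plain-Java sketch. It does not use Lucene and is not the implementation of either tokenizer: the class TokenizerComparison and its methods unigrams and bigrams are hypothetical names introduced only for illustration, and the strings "C1" to "C4" stand in for single Chinese characters as in the example above. Edge cases such as one-character input or mixed-script text are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Lucene.
public class TokenizerComparison {

    // One token per character, the shape described for ChineseTokenizer.
    static List<String> unigrams(List<String> chars) {
        return new ArrayList<>(chars);
    }

    // Overlapping two-character tokens, the shape described for CJKTokenizer.
    static List<String> bigrams(List<String> chars) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + 1 < chars.size(); i++) {
            tokens.add(chars.get(i) + chars.get(i + 1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "C1".."C4" are placeholders for four consecutive Chinese characters.
        List<String> text = Arrays.asList("C1", "C2", "C3", "C4");

        System.out.println(unigrams(text)); // [C1, C2, C3, C4]
        System.out.println(bigrams(text));  // [C1C2, C2C3, C3C4]

        // A single-character query term appears among the unigram tokens
        // but among none of the bigram tokens.
        System.out.println(unigrams(text).contains("C1")); // true
        System.out.println(bigrams(text).contains("C1"));  // false
    }
}
```

In this sketch a query term is compared against the token list directly. A real index lookup also depends on how the query itself is analyzed, so the sketch shows only which tokens each scheme produces.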
Fields inherited from class org.apache.lucene.analysis.Tokenizer
input
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState
ChineseTokenizer
public ChineseTokenizer(Reader in)
- Deprecated.
ChineseTokenizer
public ChineseTokenizer(AttributeSource source,
Reader in)
- Deprecated.
ChineseTokenizer
public ChineseTokenizer(AttributeSource.AttributeFactory factory,
Reader in)
- Deprecated.
incrementToken
public boolean incrementToken()
throws IOException
- Deprecated.
- Specified by:
incrementToken
in class TokenStream
- Throws:
IOException
end
public final void end()
- Deprecated.
- Overrides:
end
in class TokenStream
reset
public void reset()
throws IOException
- Deprecated.
- Overrides:
reset
in class TokenStream
- Throws:
IOException
Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.