org.apache.lucene.analysis.cn
Class ChineseTokenizer

java.lang.Object
  extended by org.apache.lucene.util.AttributeSource
      extended by org.apache.lucene.analysis.TokenStream
          extended by org.apache.lucene.analysis.Tokenizer
              extended by org.apache.lucene.analysis.cn.ChineseTokenizer
All Implemented Interfaces:
Closeable

Deprecated. (3.1) Use StandardTokenizer instead, which has the same functionality. This tokenizer will be removed in Lucene 5.0.

@Deprecated
public final class ChineseTokenizer
extends Tokenizer

Tokenizes Chinese text into individual Chinese characters.

ChineseTokenizer and CJKTokenizer differ in their token parsing logic.

For example, if the Chinese text "C1C2C3C4" is to be indexed:

The tokens returned from ChineseTokenizer are C1, C2, C3, C4.
The tokens returned from CJKTokenizer are C1C2, C2C3, C3C4.

Therefore the index created by CJKTokenizer is much larger.

However, searches for C1, C1C2, C1C3, C4C2, C1C2C3 ... work with ChineseTokenizer but do not work with CJKTokenizer.
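
The contrast between the two parsing strategies can be illustrated with a small standalone sketch. This is not Lucene's implementation and does not use the Lucene API: the class TokenSketch and its methods unigrams and bigrams are hypothetical names introduced here, and the sketch assumes the input contains only Chinese characters from the Basic Multilingual Plane (no surrogate pairs, no Latin text or digits).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Lucene.
final class TokenSketch {

    // One token per character, as described for ChineseTokenizer.
    static List<String> unigrams(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            tokens.add(text.substring(i, i + 1));
        }
        return tokens;
    }

    // Overlapping two-character tokens, as described for CJKTokenizer.
    static List<String> bigrams(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + 1 < text.length(); i++) {
            tokens.add(text.substring(i, i + 2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "中华人民"; // plays the role of C1C2C3C4
        System.out.println(unigrams(text)); // [中, 华, 人, 民]
        System.out.println(bigrams(text));  // [中华, 华人, 人民]
    }
}
```

With single-character tokens, any combination of the indexed characters can be matched term by term, whereas a bigram index only contains the adjacent pairs that actually occurred in the text.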


Nested Class Summary
 
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
AttributeSource.AttributeFactory, AttributeSource.State
 
Field Summary
 
Fields inherited from class org.apache.lucene.analysis.Tokenizer
input
 
Constructor Summary
ChineseTokenizer(AttributeSource.AttributeFactory factory, Reader in)
          Deprecated.  
ChineseTokenizer(AttributeSource source, Reader in)
          Deprecated.  
ChineseTokenizer(Reader in)
          Deprecated.  
 
Method Summary
 void end()
          Deprecated.  
 boolean incrementToken()
          Deprecated.  
 void reset()
          Deprecated.  
 
Methods inherited from class org.apache.lucene.analysis.Tokenizer
close, correctOffset, setReader
 
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ChineseTokenizer

public ChineseTokenizer(Reader in)
Deprecated. 

ChineseTokenizer

public ChineseTokenizer(AttributeSource source,
                        Reader in)
Deprecated. 

ChineseTokenizer

public ChineseTokenizer(AttributeSource.AttributeFactory factory,
                        Reader in)
Deprecated. 

Method Detail

incrementToken

public boolean incrementToken()
                       throws IOException
Deprecated. 
Specified by:
incrementToken in class TokenStream
Throws:
IOException

end

public final void end()
Deprecated. 
Overrides:
end in class TokenStream

reset

public void reset()
           throws IOException
Deprecated. 
Overrides:
reset in class TokenStream
Throws:
IOException


Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.