org.apache.lucene.analysis.cn (Lucene 3.6.0 API)


Class Summary
Class	Description
ChineseAnalyzer	Deprecated. Use `StandardAnalyzer` instead, which has the same functionality.
ChineseFilter	Deprecated. Use `StopFilter` instead, which has the same functionality.
ChineseTokenizer	Deprecated. Use `StandardTokenizer` instead, which has the same functionality.

Package org.apache.lucene.analysis.cn Description

Analyzer for Chinese, which indexes unigrams (individual Chinese characters).

Three analyzers are provided for Chinese, each of which treats Chinese text in a different way.

StandardAnalyzer: Index unigrams (individual Chinese characters) as tokens.
CJKAnalyzer (in the analyzers/cjk package): Index bigrams (overlapping groups of two adjacent Chinese characters) as tokens.
SmartChineseAnalyzer (in the analyzers/smartcn package): Index words (attempt to segment Chinese text into words) as tokens.

Example phrase: "我是中国人"

StandardAnalyzer: 我－是－中－国－人
CJKAnalyzer: 我是－是中－中国－国人
SmartChineseAnalyzer: 我－是－中国－人
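The difference between the unigram and bigram outputs above can be illustrated with a small standalone sketch. It is plain Java and does not call Lucene. The class `NGramSketch` and its methods `unigrams` and `bigrams` are hypothetical names introduced only for this illustration, and the sketch assumes the input consists solely of Chinese characters (no punctuation, Latin text, or stop words, all of which the real analyzers handle differently). Word segmentation as done by SmartChineseAnalyzer needs a language model and is not attempted here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: NOT Lucene's implementation of either analyzer.
final class NGramSketch {

    // One token per Unicode code point (unigrams).
    static List<String> unigrams(String text) {
        List<String> tokens = new ArrayList<String>();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            tokens.add(new String(Character.toChars(codePoint)));
            i += Character.charCount(codePoint);
        }
        return tokens;
    }

    // Overlapping pairs of adjacent characters (bigrams).
    // Assumption: input shorter than two characters yields no tokens here.
    static List<String> bigrams(String text) {
        List<String> chars = unigrams(text);
        List<String> tokens = new ArrayList<String>();
        for (int i = 0; i + 1 < chars.size(); i++) {
            tokens.add(chars.get(i) + chars.get(i + 1));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String phrase = "我是中国人";
        System.out.println(unigrams(phrase)); // [我, 是, 中, 国, 人]
        System.out.println(bigrams(phrase));  // [我是, 是中, 中国, 国人]
    }
}
```

A phrase of n characters produces n unigram tokens and n - 1 overlapping bigram tokens, which is why bigram indexes tend to be larger but match multi-character terms more precisely.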
