Package org.apache.lucene.analysis.cn

Analyzer for Chinese, which indexes unigrams (individual Chinese characters).


Class Summary
ChineseAnalyzer: An Analyzer that tokenizes text with ChineseTokenizer and filters the resulting tokens with ChineseFilter.
ChineseFilter: A TokenFilter with a stop word table.
ChineseTokenizer: Tokenizes Chinese text as individual Chinese characters.
 

Package org.apache.lucene.analysis.cn Description

Analyzer for Chinese, which indexes unigrams (individual Chinese characters).

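Three analyzers are provided for Chinese, each of which tokenizes Chinese text differently. ChineseAnalyzer (in this package) indexes unigrams, CJKAnalyzer indexes overlapping bigrams (pairs of adjacent characters), and SmartChineseAnalyzer attempts to segment the text into words.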

Example phrase: "我是中国人"
  1. ChineseAnalyzer: 我-是-中-国-人
  2. CJKAnalyzer: 我是-是中-中国-国人
  3. SmartChineseAnalyzer: 我-是-中国-人
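
The unigram and bigram outputs in the example can be reproduced with a few lines of plain Java. The sketch below is only an illustration of where the token boundaries fall. It does not use Lucene, and the class name ChineseNGramSketch and the method ngrams are hypothetical names introduced here. It also assumes the input contains only Chinese characters, so it ignores the stop word filtering, Latin text, and punctuation that the real analyzers handle.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: it only mimics the token
// boundaries shown in the example phrase above.
final class ChineseNGramSketch {

    // Returns all overlapping n-grams of the text, counted in Unicode code points.
    // n = 1 yields unigrams; n = 2 yields overlapping bigrams.
    static List<String> ngrams(String text, int n) {
        int[] codePoints = text.codePoints().toArray();
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + n <= codePoints.length; i++) {
            tokens.add(new String(codePoints, i, n));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String phrase = "我是中国人";
        System.out.println(String.join("-", ngrams(phrase, 1))); // 我-是-中-国-人
        System.out.println(String.join("-", ngrams(phrase, 2))); // 我是-是中-中国-国人
    }
}
```

The word-based output of SmartChineseAnalyzer (我-是-中国-人) cannot be produced by a fixed-width split like this, since deciding that 中国 is one word requires a language model or dictionary.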



Copyright © 2000-2010 Apache Software Foundation. All Rights Reserved.