Analyzer for Simplified Chinese, which indexes words.


Class Summary
AnalyzerProfile Manages analysis data configuration for SmartChineseAnalyzer.
CharType Internal SmartChineseAnalyzer character type constants.
SentenceTokenizer Tokenizes input text into sentences.
SmartChineseAnalyzer SmartChineseAnalyzer is an analyzer for Chinese or mixed Chinese-English text.
Utility SmartChineseAnalyzer utility constants and methods.
WordTokenFilter A TokenFilter that breaks sentences into words.
WordType Internal SmartChineseAnalyzer token type constants.

Package Description

Analyzer for Simplified Chinese, which indexes words.

WARNING: This API is experimental and might change in incompatible ways in the next release.

Three analyzers are provided for Chinese, each of which treats Chinese text in a different way. Example phrase: "我是中国人"
  1. StandardAnalyzer indexes individual characters (unigrams): 我-是-中-国-人
  2. CJKAnalyzer indexes overlapping pairs of adjacent characters (bigrams): 我是-是中-中国-国人
  3. SmartChineseAnalyzer segments the text into words: 我-是-中国-人
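
The difference between the three outputs above can be reproduced with a small standalone sketch. This is not Lucene code and it calls none of the analyzers; the class name ChineseTokenizationSketch and its methods are hypothetical, written only for illustration. The words method uses greedy longest-match against a one-entry toy dictionary as a stand-in for real word segmentation, so it should not be read as the algorithm SmartChineseAnalyzer uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy illustration only: hypothetical helper class, not part of Lucene.
// Assumes every character fits in one Java char (no surrogate pairs).
class ChineseTokenizationSketch {

    // Unigrams: one token per character (the StandardAnalyzer-style output).
    static List<String> unigrams(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            tokens.add(text.substring(i, i + 1));
        }
        return tokens;
    }

    // Bigrams: overlapping pairs of adjacent characters (the CJKAnalyzer-style output).
    static List<String> bigrams(String text) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + 1 < text.length(); i++) {
            tokens.add(text.substring(i, i + 2));
        }
        return tokens;
    }

    // Words: greedy longest match against a dictionary, falling back to a
    // single character when no dictionary entry starts at position i.
    // This is a stand-in technique, not SmartChineseAnalyzer's segmentation.
    static List<String> words(String text, Set<String> dictionary, int maxWordLength) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            int end = Math.min(text.length(), i + maxWordLength);
            // Shrink the candidate until it is in the dictionary or one character long.
            while (end > i + 1 && !dictionary.contains(text.substring(i, end))) {
                end--;
            }
            tokens.add(text.substring(i, end));
            i = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        String phrase = "我是中国人";
        // Hypothetical one-word dictionary, just enough for the example phrase.
        Set<String> dictionary = new HashSet<>(Arrays.asList("中国"));

        System.out.println(String.join("-", unigrams(phrase)));              // 我-是-中-国-人
        System.out.println(String.join("-", bigrams(phrase)));               // 我是-是中-中国-国人
        System.out.println(String.join("-", words(phrase, dictionary, 4)));  // 我-是-中国-人
    }
}
```

Unigrams and bigrams need no linguistic knowledge, which is why they can be computed from the text alone. Word-level output depends entirely on what the segmenter knows about the language; in this sketch that knowledge is the toy dictionary.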

Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.