- SegToken - Class in org.apache.lucene.analysis.cn.smart.hhmm
-
SmartChineseAnalyzer internal token
- SegToken(char[], int, int, int, int) - Constructor for class org.apache.lucene.analysis.cn.smart.hhmm.SegToken
-
Create a new SegToken from a character array.
- SegTokenFilter - Class in org.apache.lucene.analysis.cn.smart.hhmm
-
Filters a SegToken by converting full-width Latin to half-width, then lowercasing Latin.
- SegTokenFilter() - Constructor for class org.apache.lucene.analysis.cn.smart.hhmm.SegTokenFilter
-
- SENTENCE_BEGIN - Static variable in class org.apache.lucene.analysis.cn.smart.WordType
-
Start of a Sentence
- SENTENCE_END - Static variable in class org.apache.lucene.analysis.cn.smart.WordType
-
End of a Sentence
- SentenceTokenizer - Class in org.apache.lucene.analysis.cn.smart
-
Tokenizes input text into sentences.
- SentenceTokenizer(Reader) - Constructor for class org.apache.lucene.analysis.cn.smart.SentenceTokenizer
-
- SentenceTokenizer(AttributeSource.AttributeFactory, Reader) - Constructor for class org.apache.lucene.analysis.cn.smart.SentenceTokenizer
-
- SmartChineseAnalyzer - Class in org.apache.lucene.analysis.cn.smart
-
SmartChineseAnalyzer is an analyzer for Chinese or mixed Chinese-English text.
- SmartChineseAnalyzer(Version) - Constructor for class org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer
-
Create a new SmartChineseAnalyzer, using the default stopword list.
- SmartChineseAnalyzer(Version, boolean) - Constructor for class org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer
-
Create a new SmartChineseAnalyzer, optionally using the default stopword list.
- SmartChineseAnalyzer(Version, CharArraySet) - Constructor for class org.apache.lucene.analysis.cn.smart.SmartChineseAnalyzer
-
Create a new SmartChineseAnalyzer, using the provided Set of stopwords.
- SmartChineseSentenceTokenizerFactory - Class in org.apache.lucene.analysis.cn.smart
-
- SmartChineseSentenceTokenizerFactory(Map<String, String>) - Constructor for class org.apache.lucene.analysis.cn.smart.SmartChineseSentenceTokenizerFactory
-
Creates a new SmartChineseSentenceTokenizerFactory
- SmartChineseWordTokenFilterFactory - Class in org.apache.lucene.analysis.cn.smart
-
- SmartChineseWordTokenFilterFactory(Map<String, String>) - Constructor for class org.apache.lucene.analysis.cn.smart.SmartChineseWordTokenFilterFactory
-
Creates a new SmartChineseWordTokenFilterFactory
- SPACE_LIKE - Static variable in class org.apache.lucene.analysis.cn.smart.CharType
-
Characters that act as a space
- SPACES - Static variable in class org.apache.lucene.analysis.cn.smart.Utility
-
Space-like characters that need to be skipped: such as space, tab, newline, carriage return.
- START_CHAR_ARRAY - Static variable in class org.apache.lucene.analysis.cn.smart.Utility
-
- startOffset - Variable in class org.apache.lucene.analysis.cn.smart.hhmm.SegToken
-
start offset into original sentence
- STRING - Static variable in class org.apache.lucene.analysis.cn.smart.WordType
-
ASCII String
- STRING_CHAR_ARRAY - Static variable in class org.apache.lucene.analysis.cn.smart.Utility
-
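
The normalization described for SegTokenFilter above (full-width Latin folded to half-width, then lowercased) can be illustrated with a small standalone sketch. This is not the Lucene implementation: the class name WidthFoldSketch and the method normalize are hypothetical, and the sketch assumes the fold covers the whole full-width ASCII range U+FF01 to U+FF5E (which also includes full-width digits and punctuation) by subtracting the fixed offset 0xFEE0.

```java
public class WidthFoldSketch {
    // Full-width forms U+FF01..U+FF5E mirror ASCII U+0021..U+007E at a fixed offset.
    private static final char FULLWIDTH_START = '\uFF01';
    private static final char FULLWIDTH_END = '\uFF5E';
    private static final int OFFSET = 0xFEE0;

    // Hypothetical helper, not part of Lucene: folds width first, then case.
    static String normalize(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // Step 1: map a full-width character to its half-width ASCII counterpart.
            if (c >= FULLWIDTH_START && c <= FULLWIDTH_END) {
                c = (char) (c - OFFSET);
            }
            // Step 2: lowercase ASCII Latin letters only; other characters pass through.
            if (c >= 'A' && c <= 'Z') {
                c = (char) (c + ('a' - 'A'));
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Full-width "Lucene" followed by Chinese text, which is left unchanged.
        System.out.println(normalize("\uFF2C\uFF55\uFF43\uFF45\uFF4E\uFF45 \u4E2D\u6587"));
    }
}
```

Folding width before case means a full-width capital such as U+FF2C ends up as a plain lowercase "l", so mixed Chinese-English input yields the same Latin tokens regardless of how the Latin characters were typed.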