HMMChineseTokenizerFactory (Lucene 5.2.0 API)


java.lang.Object
- org.apache.lucene.analysis.util.AbstractAnalysisFactory
  - org.apache.lucene.analysis.util.TokenizerFactory
    - org.apache.lucene.analysis.cn.smart.HMMChineseTokenizerFactory

```
public final class HMMChineseTokenizerFactory
extends TokenizerFactory
```
Factory for HMMChineseTokenizer.
Note: this tokenizer currently emits tokens for punctuation. To remove them, either add a WordDelimiterFilter after the tokenizer (with concatenate off), or use a StopFilterFactory with the SmartChinese stoplist via: words="org/apache/lucene/analysis/cn/smart/stopwords.txt"
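
The stoplist option described above can be sketched in plain Java roughly as follows. This is a minimal sketch, not the canonical setup: it assumes the Lucene 5.2.0 `lucene-analyzers-common` and `lucene-analyzers-smartcn` jars are on the classpath, and the class name `HmmChineseStopExample` and the method `tokenize` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.cn.smart.HMMChineseTokenizerFactory;
import org.apache.lucene.analysis.core.StopFilterFactory;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.util.ClasspathResourceLoader;

// Hypothetical helper class, for illustration only.
class HmmChineseStopExample {

  // Tokenizes text with HMMChineseTokenizer and drops entries found in the SmartChinese stoplist.
  static List<String> tokenize(String text) throws IOException {
    // The factory takes a mutable argument map; an empty map uses the defaults.
    Tokenizer tokenizer = new HMMChineseTokenizerFactory(new HashMap<String, String>()).create();
    tokenizer.setReader(new StringReader(text));

    // Point the stop filter at the stoplist bundled with the smartcn module.
    Map<String, String> stopArgs = new HashMap<>();
    stopArgs.put("words", "org/apache/lucene/analysis/cn/smart/stopwords.txt");
    StopFilterFactory stopFactory = new StopFilterFactory(stopArgs);
    // inform() loads the word list; here it is resolved from the classpath.
    stopFactory.inform(
        new ClasspathResourceLoader(HMMChineseTokenizerFactory.class.getClassLoader()));

    List<String> terms = new ArrayList<>();
    // Closing the filter also closes the wrapped tokenizer.
    try (TokenStream stream = stopFactory.create(tokenizer)) {
      CharTermAttribute term = stream.addAttribute(CharTermAttribute.class);
      stream.reset();
      while (stream.incrementToken()) {
        terms.add(term.toString());
      }
      stream.end();
    }
    return terms;
  }

  public static void main(String[] args) throws IOException {
    System.out.println(tokenize("我购买了道具和服装。"));
  }
}
```

In a Solr schema the same chain would normally be declared as a tokenizer followed by a stop filter with the `words` attribute quoted above; the Java form is shown only to make the order of the steps explicit.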

WARNING: This API is experimental and might change in incompatible ways in the next release.

- Field Summary
  - Fields inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
    LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
- Constructor Summary
  
  Constructors
  Constructor and Description
  
  HMMChineseTokenizerFactory(Map<String,String> args)
  Creates a new HMMChineseTokenizerFactory
- Method Summary
  
  Methods
  Modifier and Type Method and Description
  
  Tokenizer create(AttributeFactory factory)
  - Methods inherited from class org.apache.lucene.analysis.util.TokenizerFactory
    availableTokenizers, create, forName, lookupClass, reloadTokenizers
  - Methods inherited from class org.apache.lucene.analysis.util.AbstractAnalysisFactory
    get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitFileNames
  - Methods inherited from class java.lang.Object
    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - HMMChineseTokenizerFactory
```
public HMMChineseTokenizerFactory(Map<String,String> args)
```
    Creates a new HMMChineseTokenizerFactory
- Method Detail
  - create
```
public Tokenizer create(AttributeFactory factory)
```
    Specified by: create in class TokenizerFactory
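
As a rough illustration of calling `create(AttributeFactory)` directly, the sketch below builds a tokenizer with Lucene's default token attribute factory and collects the raw terms. It assumes the Lucene 5.2.0 smartcn and core jars are on the classpath; `HmmChineseCreateExample` and `rawTokens` are hypothetical names. No filter is applied here, so punctuation tokens may appear in the result.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.cn.smart.HMMChineseTokenizerFactory;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Hypothetical helper class, for illustration only.
class HmmChineseCreateExample {

  // Returns the unfiltered terms produced by the tokenizer for the given text.
  static List<String> rawTokens(String text) throws IOException {
    HMMChineseTokenizerFactory factory =
        new HMMChineseTokenizerFactory(new HashMap<String, String>());

    List<String> terms = new ArrayList<>();
    // Pass the attribute factory explicitly; the inherited no-argument create()
    // is the shorter route when the default is all that is needed.
    try (Tokenizer tokenizer = factory.create(TokenStream.DEFAULT_TOKEN_ATTRIBUTE_FACTORY)) {
      tokenizer.setReader(new StringReader(text));
      CharTermAttribute term = tokenizer.addAttribute(CharTermAttribute.class);
      tokenizer.reset();
      while (tokenizer.incrementToken()) {
        terms.add(term.toString());
      }
      tokenizer.end();
    }
    return terms;
  }

  public static void main(String[] args) throws IOException {
    System.out.println(rawTokens("我购买了道具和服装。"));
  }
}
```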


Copyright © 2000-2015 Apache Software Foundation. All Rights Reserved.