Package org.apache.lucene.analysis.ko
Class KoreanTokenizerFactory
- java.lang.Object
  - org.apache.lucene.analysis.AbstractAnalysisFactory
    - org.apache.lucene.analysis.TokenizerFactory
      - org.apache.lucene.analysis.ko.KoreanTokenizerFactory
- All Implemented Interfaces:
ResourceLoaderAware
public class KoreanTokenizerFactory extends TokenizerFactory implements ResourceLoaderAware
Factory for KoreanTokenizer.
<fieldType name="text_ko" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KoreanTokenizerFactory"
               decompoundMode="discard"
               userDictionary="user.txt"
               userDictionaryEncoding="UTF-8"
               outputUnknownUnigrams="false"
               discardPunctuation="true" />
  </analyzer>
</fieldType>
Supports the following attributes:
- userDictionary: User dictionary path.
- userDictionaryEncoding: User dictionary encoding.
- decompoundMode: Decompound mode. One of 'none', 'discard', or 'mixed'. Default is 'discard'. See KoreanTokenizer.DecompoundMode.
- outputUnknownUnigrams: If true, outputs unigrams for unknown words.
- discardPunctuation: true if punctuation tokens should be dropped from the output.
- Since:
- 7.4.0
- WARNING: This API is experimental and might change in incompatible ways in the next release.
- SPI Name (case-insensitive: if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service).
- "korean"
-
-
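The attributes listed above map onto the Map<String,String> argument accepted by the KoreanTokenizerFactory(Map<String,String> args) constructor. The following is a minimal sketch of assembling such a map using only the JDK. The class KoreanTokenizerArgs and its build method are hypothetical helpers invented for illustration and are not part of Lucene. The user dictionary attributes are omitted, and the map is kept mutable on the assumption that the factory consumes the entries it recognizes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper (not part of Lucene) that assembles factory arguments. */
class KoreanTokenizerArgs {

    // The three decompound modes named in the attribute list on this page.
    static final Set<String> DECOMPOUND_MODES = Set.of("none", "discard", "mixed");

    /** Builds a mutable argument map keyed by the documented attribute names. */
    static Map<String, String> build(String decompoundMode,
                                     boolean outputUnknownUnigrams,
                                     boolean discardPunctuation) {
        // Reject values outside the documented set before they reach the factory.
        if (!DECOMPOUND_MODES.contains(decompoundMode)) {
            throw new IllegalArgumentException("Unsupported decompoundMode: " + decompoundMode);
        }
        Map<String, String> args = new HashMap<>();
        args.put("decompoundMode", decompoundMode);
        args.put("outputUnknownUnigrams", Boolean.toString(outputUnknownUnigrams));
        args.put("discardPunctuation", Boolean.toString(discardPunctuation));
        return args;
    }

    public static void main(String[] argv) {
        // Same attribute values as the fieldType example on this page.
        Map<String, String> args = build("discard", false, true);
        System.out.println(args);
        // With the Lucene Korean analysis module on the classpath, the map could
        // then be passed to the constructor documented on this page:
        //   KoreanTokenizerFactory factory = new KoreanTokenizerFactory(args);
        //   KoreanTokenizer tokenizer = factory.create(attributeFactory);
        // where attributeFactory is an AttributeFactory supplied by the caller.
    }
}
```

Validating decompoundMode up front is a design choice of this sketch only. How the factory itself reports an unsupported value is not stated on this page.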
Field Summary
Modifier and Type    Field    Description
static String        NAME     SPI name
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
-
-
Constructor Summary
Constructor                                        Description
KoreanTokenizerFactory()                           Default ctor for compatibility with SPI
KoreanTokenizerFactory(Map<String,String> args)    Creates a new KoreanTokenizerFactory
-
Method Summary
Modifier and Type    Method
KoreanTokenizer      create(AttributeFactory factory)
void                 inform(ResourceLoader loader)
-
Methods inherited from class org.apache.lucene.analysis.TokenizerFactory
availableTokenizers, create, findSPIName, forName, lookupClass, reloadTokenizers
-
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
-
-
-
-
Field Detail
-
NAME
public static final String NAME
SPI name
- See Also:
- Constant Field Values
-
-
Method Detail
-
inform
public void inform(ResourceLoader loader) throws IOException
- Specified by:
inform in interface ResourceLoaderAware
- Throws:
IOException
-
create
public KoreanTokenizer create(AttributeFactory factory)
- Specified by:
create in class TokenizerFactory
-
-