KoreanTokenizerFactory (Lucene 9.11.1 nori API)

java.lang.Object
- org.apache.lucene.analysis.AbstractAnalysisFactory
  - org.apache.lucene.analysis.TokenizerFactory
    - org.apache.lucene.analysis.ko.KoreanTokenizerFactory

All Implemented Interfaces:

ResourceLoaderAware
```
public class KoreanTokenizerFactory
extends TokenizerFactory
implements ResourceLoaderAware
```
Factory for KoreanTokenizer.
```
 <fieldType name="text_ko" class="solr.TextField">
   <analyzer>
     <tokenizer class="solr.KoreanTokenizerFactory"
                decompoundMode="discard"
                userDictionary="user.txt"
                userDictionaryEncoding="UTF-8"
                outputUnknownUnigrams="false"
                discardPunctuation="true"
     />
   </analyzer>
 </fieldType>
```
Supports the following attributes:
- userDictionary: User dictionary path.
- userDictionaryEncoding: User dictionary encoding.
- decompoundMode: Decompound mode. One of 'none', 'discard', or 'mixed'. Default is 'discard'. See KoreanTokenizer.DecompoundMode.
- outputUnknownUnigrams: If true, outputs unigrams for unknown words.
- discardPunctuation: If true, punctuation tokens are dropped from the output.
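
When the factory is built programmatically rather than through Solr XML, the same attributes travel as string key/value pairs in the `Map<String,String>` accepted by the constructor documented on this page. The following is a minimal sketch of assembling such a map; `KoreanTokenizerArgs` and its `build` method are hypothetical helpers, not part of Lucene, and the up-front check of `decompoundMode` is an assumption made for illustration rather than a description of how the factory validates its input.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper (not part of Lucene) that assembles factory arguments. */
public class KoreanTokenizerArgs {

    // The three decompound modes named in the attribute list.
    private static final Set<String> DECOMPOUND_MODES = Set.of("none", "discard", "mixed");

    /**
     * Builds a mutable argument map keyed by the documented attribute names.
     * A null userDictionary means "no user dictionary", so both dictionary
     * keys are left out in that case.
     */
    public static Map<String, String> build(String decompoundMode,
                                            String userDictionary,
                                            String userDictionaryEncoding,
                                            boolean outputUnknownUnigrams,
                                            boolean discardPunctuation) {
        // Illustrative check only: reject anything outside the documented values.
        if (decompoundMode == null || !DECOMPOUND_MODES.contains(decompoundMode)) {
            throw new IllegalArgumentException(
                "decompoundMode must be one of " + DECOMPOUND_MODES + ", got: " + decompoundMode);
        }
        Map<String, String> args = new LinkedHashMap<>();
        args.put("decompoundMode", decompoundMode);
        if (userDictionary != null) {
            args.put("userDictionary", userDictionary);
            if (userDictionaryEncoding != null) {
                args.put("userDictionaryEncoding", userDictionaryEncoding);
            }
        }
        args.put("outputUnknownUnigrams", Boolean.toString(outputUnknownUnigrams));
        args.put("discardPunctuation", Boolean.toString(discardPunctuation));
        return args;
    }

    public static void main(String[] unused) {
        // Mirrors the XML configuration shown earlier on this page.
        Map<String, String> args = build("discard", "user.txt", "UTF-8", false, true);
        System.out.println(args);
        // With Lucene's nori module on the classpath, this map would be passed
        // to the documented constructor: new KoreanTokenizerFactory(args)
    }
}
```

The sketch returns a mutable map deliberately, on the assumption that the receiving code may want to modify or consume entries; an immutable map such as one from `Map.of` may not be suitable.
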
Since:

7.4.0

WARNING: This API is experimental and might change in incompatible ways in the next release.

SPI Name (case-insensitive: if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service).

"korean"

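The case-insensitivity described above means that "korean", "Korean", and "KOREAN" all refer to the same factory. A minimal, self-contained sketch of that idea follows; `SpiNameLookupSketch` is a hypothetical registry written for illustration and is not Lucene's implementation. In Lucene itself, name-based lookup goes through the inherited `TokenizerFactory` methods `forName` and `lookupClass` listed in the Method Summary.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical registry illustrating case-insensitive name lookup; not Lucene's code. */
public class SpiNameLookupSketch {

    // Keys are stored lower-cased so lookups can ignore the caller's casing.
    private final Map<String, String> classNameBySpiName = new HashMap<>();

    /** Registers an implementation class name under a lower-cased SPI name. */
    public void register(String spiName, String className) {
        classNameBySpiName.put(spiName.toLowerCase(Locale.ROOT), className);
    }

    /** Looks up a class name, ignoring the case of the requested name. */
    public String lookup(String spiName) {
        String className = classNameBySpiName.get(spiName.toLowerCase(Locale.ROOT));
        if (className == null) {
            throw new IllegalArgumentException("No factory registered for name: " + spiName);
        }
        return className;
    }

    public static void main(String[] unused) {
        SpiNameLookupSketch registry = new SpiNameLookupSketch();
        registry.register("korean", "org.apache.lucene.analysis.ko.KoreanTokenizerFactory");
        // Both spellings resolve to the same registered class name.
        System.out.println(registry.lookup("korean"));
        System.out.println(registry.lookup("Korean"));
    }
}
```

Lower-casing with `Locale.ROOT` keeps the comparison independent of the JVM's default locale, which matters for names containing letters such as 'I'.
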
Field Summary

Fields
Modifier and Type	Field	Description
`static String`	`NAME`	SPI name
- Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
  LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion

Constructor Summary

Constructors
Constructor	Description
`KoreanTokenizerFactory()`	Default constructor for compatibility with SPI
`KoreanTokenizerFactory(Map<String,String> args)`	Creates a new KoreanTokenizerFactory

Method Summary

Modifier and Type	Method
`KoreanTokenizer`	`create(AttributeFactory factory)`
`void`	`inform(ResourceLoader loader)`
- Methods inherited from class org.apache.lucene.analysis.TokenizerFactory
  availableTokenizers, create, findSPIName, forName, lookupClass, reloadTokenizers
- Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
  defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
- Methods inherited from class java.lang.Object
  clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - NAME
```
public static final String NAME
```
    SPI name
    
    See Also:
    
    Constant Field Values
- Constructor Detail
  - KoreanTokenizerFactory
```
public KoreanTokenizerFactory(Map<String,String> args)
```
    Creates a new KoreanTokenizerFactory
  - KoreanTokenizerFactory
```
public KoreanTokenizerFactory()
```
    Default constructor for compatibility with SPI
- Method Detail
  - inform
```
public void inform(ResourceLoader loader)
            throws IOException
```
    Specified by:
    
    inform in interface ResourceLoaderAware
    
    Throws:
    
    IOException
  - create
```
public KoreanTokenizer create(AttributeFactory factory)
```
    Specified by:
    
    create in class TokenizerFactory