public class KoreanAnalyzer extends Analyzer
See also: KoreanTokenizer
Nested classes inherited from class Analyzer: Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
Fields inherited from class Analyzer: GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
Constructor and Description |
---|
KoreanAnalyzer() Creates a new KoreanAnalyzer. |
KoreanAnalyzer(UserDictionary userDict, KoreanTokenizer.DecompoundMode mode, Set<POS.Tag> stopTags, boolean outputUnknownUnigrams) Creates a new KoreanAnalyzer. |
Modifier and Type | Method and Description |
---|---|
protected Analyzer.TokenStreamComponents | createComponents(String fieldName) |
protected TokenStream | normalize(String fieldName, TokenStream in) |
Methods inherited from class Analyzer: attributeFactory, close, getOffsetGap, getPositionIncrementGap, getReuseStrategy, getVersion, initReader, initReaderForNormalization, normalize, setVersion, tokenStream, tokenStream
public KoreanAnalyzer()
public KoreanAnalyzer(UserDictionary userDict, KoreanTokenizer.DecompoundMode mode, Set<POS.Tag> stopTags, boolean outputUnknownUnigrams)
userDict - Optional: if non-null, the user dictionary.
mode - Decompound mode.
stopTags - The set of part-of-speech tags that should be filtered out.
outputUnknownUnigrams - If true, outputs unigrams for unknown words.
protected Analyzer.TokenStreamComponents createComponents(String fieldName)
Specified by: createComponents in class Analyzer
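Because createComponents is protected, callers normally reach the analysis chain through the inherited tokenStream method. The following is a minimal sketch of that usage with the four-argument constructor, assuming the Lucene 8.x nori module is on the classpath. The class name KoreanAnalyzerExample, the field name "body", and the sample text are hypothetical. KoreanPartOfSpeechStopFilter.DEFAULT_STOP_TAGS is not described on this page and is used here only as a convenient stop-tag set.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.ko.KoreanAnalyzer;
import org.apache.lucene.analysis.ko.KoreanPartOfSpeechStopFilter;
import org.apache.lucene.analysis.ko.KoreanTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class KoreanAnalyzerExample {

  // Runs the analyzer over the text and collects the emitted terms.
  static List<String> analyze(Analyzer analyzer, String field, String text) throws IOException {
    List<String> terms = new ArrayList<>();
    try (TokenStream ts = analyzer.tokenStream(field, text)) {
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      ts.reset();                      // required before the first incrementToken()
      while (ts.incrementToken()) {
        terms.add(term.toString());
      }
      ts.end();                        // finalizes offsets; close() happens via try-with-resources
    }
    return terms;
  }

  public static void main(String[] args) throws IOException {
    try (Analyzer analyzer = new KoreanAnalyzer(
        null,                                           // userDict: no user dictionary
        KoreanTokenizer.DecompoundMode.MIXED,           // mode: how compounds are handled
        KoreanPartOfSpeechStopFilter.DEFAULT_STOP_TAGS, // stopTags: assumed default tag set
        false)) {                                       // outputUnknownUnigrams
      System.out.println(analyze(analyzer, "body", "한국어 형태소 분석기"));
    }
  }
}
```

The exact terms printed depend on the bundled dictionary and the chosen decompound mode, so no expected output is shown.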
protected TokenStream normalize(String fieldName, TokenStream in)
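The protected normalize hook is applied through the public normalize(String fieldName, String text) method inherited from Analyzer, which returns the normalized term as a BytesRef. A minimal sketch of calling it follows, again assuming Lucene 8.x with the nori module on the classpath. The class name KoreanNormalizeExample and the inputs are hypothetical. This page does not state what normalization KoreanAnalyzer applies, so the example prints the result rather than asserting one.

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.ko.KoreanAnalyzer;
import org.apache.lucene.util.BytesRef;

public class KoreanNormalizeExample {

  // Normalizes a single term the way query-time code would (no tokenization).
  static String normalizeTerm(String field, String text) {
    try (Analyzer analyzer = new KoreanAnalyzer()) {
      BytesRef normalized = analyzer.normalize(field, text);
      return normalized.utf8ToString();
    }
  }

  public static void main(String[] args) {
    System.out.println(normalizeTerm("body", "Lucene"));
  }
}
```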
Copyright © 2000-2021 Apache Software Foundation. All Rights Reserved.