Package | Description |
---|---|
org.apache.lucene.analysis.icu.segmentation | Tokenizer that breaks text into words with the Unicode Text Segmentation algorithm. |
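
As a rough illustration of what word-level segmentation produces, the sketch below uses the JDK's own java.text.BreakIterator rather than ICU4J or Lucene, so it runs without extra dependencies. The class name WordBreakSketch and its words method are hypothetical names chosen for this example, and the JDK's word rules are only an approximation of what the ICU-based tokenizer does.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not part of Lucene or ICU4J.
final class WordBreakSketch {

    // Splits text at word boundaries and keeps only segments that
    // contain at least one letter or digit (drops spaces and punctuation).
    static List<String> words(String text) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.ROOT);
        it.setText(text);

        List<String> out = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            if (segment.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(segment);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words("Hello, world!"));
    }
}
```

The boundary iterator reports every break, including those around spaces and punctuation, so the sketch filters segments afterwards; a real tokenizer typically makes that decision from the rule status of each boundary instead.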
Modifier and Type | Class and Description |
---|---|
class | DefaultICUTokenizerConfig: Default ICUTokenizerConfig that is generally applicable to many languages. |
Constructor and Description |
---|
ICUTokenizer(AttributeFactory factory, ICUTokenizerConfig config): Construct a new ICUTokenizer that breaks text from its input Reader into words, using the given AttributeFactory and a tailored BreakIterator configuration. |
ICUTokenizer(ICUTokenizerConfig config): Construct a new ICUTokenizer that breaks text from its input Reader into words, using a tailored BreakIterator configuration. |
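
The constructors above take the segmentation behavior as a configuration object instead of hard-coding it. The sketch below shows that pattern in miniature using only the JDK. SegmenterConfig, DefaultSegmenterConfig, and ConfiguredTokenizer are hypothetical stand-ins invented for this example; they do not reproduce the methods of ICUTokenizerConfig or ICUTokenizer.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical configuration contract (an assumption for illustration only):
// it supplies the boundary iterator and decides which segments become tokens.
interface SegmenterConfig {
    BreakIterator breakIterator();
    boolean keep(String segment);
}

// Hypothetical default: JDK word boundaries, keeping segments with a letter or digit.
final class DefaultSegmenterConfig implements SegmenterConfig {
    @Override
    public BreakIterator breakIterator() {
        return BreakIterator.getWordInstance(Locale.ROOT);
    }

    @Override
    public boolean keep(String segment) {
        return segment.codePoints().anyMatch(Character::isLetterOrDigit);
    }
}

// Hypothetical tokenizer that delegates all segmentation decisions to its config.
final class ConfiguredTokenizer {
    private final SegmenterConfig config;

    ConfiguredTokenizer(SegmenterConfig config) {
        this.config = config;
    }

    List<String> tokenize(String text) {
        BreakIterator it = config.breakIterator();
        it.setText(text);

        List<String> tokens = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            if (config.keep(segment)) {
                tokens.add(segment);
            }
        }
        return tokens;
    }
}
```

Swapping in a different SegmenterConfig changes what the tokenizer emits without touching the tokenizer itself, which is the design choice the config parameter reflects.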
Copyright © 2000-2015 Apache Software Foundation. All Rights Reserved.