Package | Description |
---|---|
org.apache.lucene.analysis.icu.segmentation | Tokenizer that breaks text into words with the Unicode Text Segmentation algorithm. |
Modifier and Type | Class and Description |
---|---|
class | DefaultICUTokenizerConfig: Default ICUTokenizerConfig that is generally applicable to many languages. |
Constructor and Description |
---|
ICUTokenizer(AttributeSource.AttributeFactory factory, Reader input, ICUTokenizerConfig config): Construct a new ICUTokenizer that breaks text into words from the given Reader, using a tailored BreakIterator configuration. |
ICUTokenizer(Reader input, ICUTokenizerConfig config): Construct a new ICUTokenizer that breaks text into words from the given Reader, using a tailored BreakIterator configuration. |
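
The constructors above describe breaking text into words with a BreakIterator. As a rough illustration of that idea only, the sketch below uses the JDK's java.text.BreakIterator as a stand-in. It is not the ICU BreakIterator that ICUTokenizer uses, it does not call any Lucene API, and the class name WordBreakSketch and the helper words are hypothetical names introduced for this example. Filtering segments by "contains a letter or digit" is likewise an assumption of the sketch, not the rule ICUTokenizerConfig applies.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: word segmentation with the JDK BreakIterator,
// standing in for the ICU BreakIterator configured by ICUTokenizerConfig.
public class WordBreakSketch {

    // Returns the segments between word boundaries that look like words.
    static List<String> words(String text) {
        BreakIterator boundaries = BreakIterator.getWordInstance(Locale.ROOT);
        boundaries.setText(text);

        List<String> result = new ArrayList<>();
        int start = boundaries.first();
        for (int end = boundaries.next();
             end != BreakIterator.DONE;
             start = end, end = boundaries.next()) {
            String segment = text.substring(start, end);
            // Assumption of this sketch: keep only segments that contain
            // at least one letter or digit, dropping spaces and punctuation.
            if (segment.codePoints().anyMatch(Character::isLetterOrDigit)) {
                result.add(segment);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Print each word-like segment on its own line.
        for (String word : words("Hello, world")) {
            System.out.println(word);
        }
    }
}
```

In the real tokenizer, the choice of BreakIterator and the decision about which segments become tokens are made by the ICUTokenizerConfig passed to the constructor, so results can differ from this stand-in, particularly for scripts that need tailored rules.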
Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.