Index
All Classes and Interfaces|All Packages|Constant Field Values
B
- Backwards Compatibility - Search tag in Overview
- Section
C
- Case Folding - Search tag in Overview
- Section
- Caveats and Comparisons - Search tag in Overview
- Section
- clear() - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
- Collation - Search tag in Overview
- Section
- combineCJ() - Method in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
- combineCJ() - Method in class org.apache.lucene.analysis.icu.segmentation.ICUTokenizerConfig
-
true if Han, Hiragana, and Katakana scripts should all be returned as Japanese
- Convert Traditional to Simplified - Search tag in Overview
- Section
- copyTo(AttributeImpl) - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
- create(Reader) - Method in class org.apache.lucene.analysis.icu.ICUNormalizer2CharFilterFactory
- create(TokenStream) - Method in class org.apache.lucene.analysis.icu.ICUFoldingFilterFactory
- create(TokenStream) - Method in class org.apache.lucene.analysis.icu.ICUNormalizer2FilterFactory
- create(TokenStream) - Method in class org.apache.lucene.analysis.icu.ICUTransformFilterFactory
- create(AttributeFactory) - Method in class org.apache.lucene.analysis.icu.segmentation.ICUTokenizerFactory
- createComponents(String) - Method in class org.apache.lucene.analysis.icu.ICUCollationKeyAnalyzer
- createInstance() - Method in class org.apache.lucene.analysis.icu.ICUCollationAttributeFactory
D
- Danish Sorting - Search tag in Overview
- Section
- DefaultICUTokenizerConfig - Class in org.apache.lucene.analysis.icu.segmentation
-
Default ICUTokenizerConfig that is generally applicable to many languages.
- DefaultICUTokenizerConfig(boolean, boolean) - Constructor for class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
-
Creates a new config.
E
- EMOJI_SEQUENCE_STATUS - Static variable in class org.apache.lucene.analysis.icu.segmentation.ICUTokenizerConfig
-
Rule status for emoji sequences
- end() - Method in class org.apache.lucene.analysis.icu.segmentation.ICUTokenizer
- equals(Object) - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
- Example Usages - Search tag in Overview
- Section
F
- Farsi Range Queries - Search tag in Overview
- Section
G
- getBreakIterator(int) - Method in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
- getBreakIterator(int) - Method in class org.apache.lucene.analysis.icu.segmentation.ICUTokenizerConfig
-
Return a BreakIterator capable of processing a given script.
- getBytesRef() - Method in class org.apache.lucene.analysis.icu.tokenattributes.ICUCollatedTermAttributeImpl
- getCode() - Method in interface org.apache.lucene.analysis.icu.tokenattributes.ScriptAttribute
-
Get the numeric code for this script value.
- getCode() - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
- getName() - Method in interface org.apache.lucene.analysis.icu.tokenattributes.ScriptAttribute
-
Get the full name.
- getName() - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
- getShortName() - Method in interface org.apache.lucene.analysis.icu.tokenattributes.ScriptAttribute
-
Get the abbreviated name.
- getShortName() - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
- getType(int, int) - Method in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
- getType(int, int) - Method in class org.apache.lucene.analysis.icu.segmentation.ICUTokenizerConfig
-
Return a token type value for a given script and BreakIterator rule status.
H
- hashCode() - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
I
- ICUCollatedTermAttributeImpl - Class in org.apache.lucene.analysis.icu.tokenattributes
-
Extension of CharTermAttributeImpl that encodes the term text as a binary Unicode collation key instead of as UTF-8 bytes.
- ICUCollatedTermAttributeImpl(Collator) - Constructor for class org.apache.lucene.analysis.icu.tokenattributes.ICUCollatedTermAttributeImpl
-
Create a new ICUCollatedTermAttributeImpl
- ICUCollationAttributeFactory - Class in org.apache.lucene.analysis.icu
-
Converts each token into its CollationKey, and then encodes bytes as an index term.
- ICUCollationAttributeFactory(Collator) - Constructor for class org.apache.lucene.analysis.icu.ICUCollationAttributeFactory
-
Create an ICUCollationAttributeFactory, using TokenStream.DEFAULT_TOKEN_ATTRIBUTE_FACTORY as the factory for all other attributes.
- ICUCollationAttributeFactory(AttributeFactory, Collator) - Constructor for class org.apache.lucene.analysis.icu.ICUCollationAttributeFactory
-
Create an ICUCollationAttributeFactory, using the supplied Attribute Factory as the factory for all other attributes.
- ICUCollationDocValuesField - Class in org.apache.lucene.analysis.icu
-
Indexes collation keys as a single-valued SortedDocValuesField.
- ICUCollationDocValuesField(String, Collator) - Constructor for class org.apache.lucene.analysis.icu.ICUCollationDocValuesField
-
Create a new ICUCollationDocValuesField.
- ICUCollationKeyAnalyzer - Class in org.apache.lucene.analysis.icu
-
Configures KeywordTokenizer with ICUCollationAttributeFactory.
- ICUCollationKeyAnalyzer(Collator) - Constructor for class org.apache.lucene.analysis.icu.ICUCollationKeyAnalyzer
-
Create a new ICUCollationKeyAnalyzer, using the specified collator.
- ICUFoldingFilter - Class in org.apache.lucene.analysis.icu
-
A TokenFilter that applies search term folding to Unicode text, applying foldings from UTR#30 Character Foldings.
- ICUFoldingFilter(TokenStream) - Constructor for class org.apache.lucene.analysis.icu.ICUFoldingFilter
-
Create a new ICUFoldingFilter on the specified input
- ICUFoldingFilter(TokenStream, Normalizer2) - Constructor for class org.apache.lucene.analysis.icu.ICUFoldingFilter
-
Create a new ICUFoldingFilter on the specified input with the specified normalizer
- ICUFoldingFilterFactory - Class in org.apache.lucene.analysis.icu
-
Factory for ICUFoldingFilter.
- ICUFoldingFilterFactory() - Constructor for class org.apache.lucene.analysis.icu.ICUFoldingFilterFactory
-
Default ctor for compatibility with SPI
- ICUFoldingFilterFactory(Map<String, String>) - Constructor for class org.apache.lucene.analysis.icu.ICUFoldingFilterFactory
-
Creates a new ICUFoldingFilterFactory
- ICUNormalizer2CharFilter - Class in org.apache.lucene.analysis.icu
-
Normalize token text with ICU's Normalizer2.
- ICUNormalizer2CharFilter(Reader) - Constructor for class org.apache.lucene.analysis.icu.ICUNormalizer2CharFilter
-
Create a new ICUNormalizer2CharFilter that applies NFKC normalization and case folding and removes default ignorables (NFKC_Casefold).
- ICUNormalizer2CharFilter(Reader, Normalizer2) - Constructor for class org.apache.lucene.analysis.icu.ICUNormalizer2CharFilter
-
Create a new ICUNormalizer2CharFilter with the specified Normalizer2.
- ICUNormalizer2CharFilterFactory - Class in org.apache.lucene.analysis.icu
-
Factory for ICUNormalizer2CharFilter.
- ICUNormalizer2CharFilterFactory() - Constructor for class org.apache.lucene.analysis.icu.ICUNormalizer2CharFilterFactory
-
Default ctor for compatibility with SPI
- ICUNormalizer2CharFilterFactory(Map<String, String>) - Constructor for class org.apache.lucene.analysis.icu.ICUNormalizer2CharFilterFactory
-
Creates a new ICUNormalizer2CharFilterFactory
- ICUNormalizer2Filter - Class in org.apache.lucene.analysis.icu
-
Normalize token text with ICU's Normalizer2.
- ICUNormalizer2Filter(TokenStream) - Constructor for class org.apache.lucene.analysis.icu.ICUNormalizer2Filter
-
Create a new ICUNormalizer2Filter that applies NFKC normalization and case folding and removes default ignorables (NFKC_Casefold).
- ICUNormalizer2Filter(TokenStream, Normalizer2) - Constructor for class org.apache.lucene.analysis.icu.ICUNormalizer2Filter
-
Create a new ICUNormalizer2Filter with the specified Normalizer2.
- ICUNormalizer2FilterFactory - Class in org.apache.lucene.analysis.icu
-
Factory for ICUNormalizer2Filter.
- ICUNormalizer2FilterFactory() - Constructor for class org.apache.lucene.analysis.icu.ICUNormalizer2FilterFactory
-
Default ctor for compatibility with SPI
- ICUNormalizer2FilterFactory(Map<String, String>) - Constructor for class org.apache.lucene.analysis.icu.ICUNormalizer2FilterFactory
-
Creates a new ICUNormalizer2FilterFactory
- ICUTokenizer - Class in org.apache.lucene.analysis.icu.segmentation
-
Breaks text into words according to UAX #29: Unicode Text Segmentation (http://www.unicode.org/reports/tr29/)
- ICUTokenizer() - Constructor for class org.apache.lucene.analysis.icu.segmentation.ICUTokenizer
-
Construct a new ICUTokenizer that breaks text into words.
- ICUTokenizer(ICUTokenizerConfig) - Constructor for class org.apache.lucene.analysis.icu.segmentation.ICUTokenizer
-
Construct a new ICUTokenizer that breaks text into words, using a tailored BreakIterator configuration.
- ICUTokenizer(AttributeFactory, ICUTokenizerConfig) - Constructor for class org.apache.lucene.analysis.icu.segmentation.ICUTokenizer
-
Construct a new ICUTokenizer that breaks text into words, using a tailored BreakIterator configuration.
- ICUTokenizerConfig - Class in org.apache.lucene.analysis.icu.segmentation
-
Class that allows for tailored Unicode Text Segmentation on a per-writing system basis.
- ICUTokenizerConfig() - Constructor for class org.apache.lucene.analysis.icu.segmentation.ICUTokenizerConfig
-
Sole constructor.
- ICUTokenizerFactory - Class in org.apache.lucene.analysis.icu.segmentation
-
Factory for ICUTokenizer.
- ICUTokenizerFactory() - Constructor for class org.apache.lucene.analysis.icu.segmentation.ICUTokenizerFactory
-
Default ctor for compatibility with SPI
- ICUTokenizerFactory(Map<String, String>) - Constructor for class org.apache.lucene.analysis.icu.segmentation.ICUTokenizerFactory
-
Creates a new ICUTokenizerFactory
- ICUTransformFilter - Class in org.apache.lucene.analysis.icu
-
A TokenFilter that transforms text with ICU.
- ICUTransformFilter(TokenStream, Transliterator) - Constructor for class org.apache.lucene.analysis.icu.ICUTransformFilter
-
Create a new ICUTransformFilter that transforms text on the given stream.
- ICUTransformFilterFactory - Class in org.apache.lucene.analysis.icu
-
Factory for ICUTransformFilter.
- ICUTransformFilterFactory() - Constructor for class org.apache.lucene.analysis.icu.ICUTransformFilterFactory
-
Default ctor for compatibility with SPI
- ICUTransformFilterFactory(Map<String, String>) - Constructor for class org.apache.lucene.analysis.icu.ICUTransformFilterFactory
-
Creates a new ICUTransformFilterFactory
- incrementToken() - Method in class org.apache.lucene.analysis.icu.ICUNormalizer2Filter
- incrementToken() - Method in class org.apache.lucene.analysis.icu.ICUTransformFilter
- incrementToken() - Method in class org.apache.lucene.analysis.icu.segmentation.ICUTokenizer
- inform(ResourceLoader) - Method in class org.apache.lucene.analysis.icu.segmentation.ICUTokenizerFactory
L
- Lowercasing text - Search tag in Overview
- Section
N
- name() - Method in class org.apache.lucene.analysis.icu.ICUCollationDocValuesField
- NAME - Static variable in class org.apache.lucene.analysis.icu.ICUFoldingFilterFactory
-
SPI name
- NAME - Static variable in class org.apache.lucene.analysis.icu.ICUNormalizer2CharFilterFactory
-
SPI name
- NAME - Static variable in class org.apache.lucene.analysis.icu.ICUNormalizer2FilterFactory
-
SPI name
- NAME - Static variable in class org.apache.lucene.analysis.icu.ICUTransformFilterFactory
-
SPI name
- NAME - Static variable in class org.apache.lucene.analysis.icu.segmentation.ICUTokenizerFactory
-
SPI name
- Normalization - Search tag in Overview
- Section
- normalize(Reader) - Method in class org.apache.lucene.analysis.icu.ICUNormalizer2CharFilterFactory
- normalize(TokenStream) - Method in class org.apache.lucene.analysis.icu.ICUFoldingFilterFactory
- normalize(TokenStream) - Method in class org.apache.lucene.analysis.icu.ICUNormalizer2FilterFactory
- normalize(TokenStream) - Method in class org.apache.lucene.analysis.icu.ICUTransformFilterFactory
- NORMALIZER - Static variable in class org.apache.lucene.analysis.icu.ICUFoldingFilter
-
A normalizer for search term folding to Unicode text, applying foldings from UTR#30 Character Foldings.
- Normalizing text to NFC - Search tag in Overview
- Section
O
- org.apache.lucene.analysis.icu - package org.apache.lucene.analysis.icu
-
Analysis components based on ICU
- org.apache.lucene.analysis.icu.segmentation - package org.apache.lucene.analysis.icu.segmentation
-
Tokenizer that breaks text into words with the Unicode Text Segmentation algorithm.
- org.apache.lucene.analysis.icu.tokenattributes - package org.apache.lucene.analysis.icu.tokenattributes
-
Additional ICU-specific Attributes for text analysis.
R
- read(char[], int, int) - Method in class org.apache.lucene.analysis.icu.ICUNormalizer2CharFilter
- reflectWith(AttributeReflector) - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
- Removing accents - Search tag in Overview
- Section
- reset() - Method in class org.apache.lucene.analysis.icu.segmentation.ICUTokenizer
- Restricting normalization to Unicode 5.0 - Search tag in Overview
- Section
S
- ScriptAttribute - Interface in org.apache.lucene.analysis.icu.tokenattributes
-
This attribute stores the UTR #24 script value for a token of text.
- ScriptAttributeImpl - Class in org.apache.lucene.analysis.icu.tokenattributes
-
Implementation of ScriptAttribute that stores the script as an integer.
- ScriptAttributeImpl() - Constructor for class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
-
Initializes this attribute with UScript.COMMON.
- Search Term Folding - Search tag in Overview
- Section
- setCode(int) - Method in interface org.apache.lucene.analysis.icu.tokenattributes.ScriptAttribute
-
Set the numeric code for this script value.
- setCode(int) - Method in class org.apache.lucene.analysis.icu.tokenattributes.ScriptAttributeImpl
- setStringValue(String) - Method in class org.apache.lucene.analysis.icu.ICUCollationDocValuesField
T
- Text Segmentation - Search tag in Overview
- Section
- Text Transformation - Search tag in Overview
- Section
- Tokenizing multilanguage text - Search tag in Overview
- Section
- Transliterate Serbian Cyrillic to Serbian Latin - Search tag in Overview
- Section
- Turkish Case Normalization - Search tag in Overview
- Section
U
- Use Cases - Search tag in Overview
- Section
W
- WORD_EMOJI - Static variable in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
-
Token type for words that appear to be emoji sequences
- WORD_HANGUL - Static variable in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
-
Token type for words containing Korean hangul
- WORD_HIRAGANA - Static variable in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
-
Token type for words containing Japanese hiragana
- WORD_IDEO - Static variable in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
-
Token type for words containing ideographic characters
- WORD_KATAKANA - Static variable in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
-
Token type for words containing Japanese katakana
- WORD_LETTER - Static variable in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
-
Token type for words that contain letters
- WORD_NUMBER - Static variable in class org.apache.lucene.analysis.icu.segmentation.DefaultICUTokenizerConfig
-
Token type for words that appear to be numbers