public class ICUNormalizer2Filter extends TokenFilter
Normalize token text with ICU's Normalizer2.
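With this filter, you can normalize text in the following ways:
- NFKC normalization, case folding, and removing Ignorables (the default)
- Using a standard normalization mode (NFC, NFD, NFKC, NFKD)
- Based on rules from a custom normalization mapping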
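If you use the defaults, this filter is a simple way to standardize Unicode text in a language-independent way for search:
- The case folding it performs can be seen as a replacement for LowerCaseFilter. For example, it handles cases such as the Greek sigma, so that "Μάϊος" and "ΜΆΪΟΣ" will match correctly.
- The normalization standardizes different forms of the same character in Unicode. For example, CJK full-width numbers are standardized to their ASCII forms.
- Ignorables such as Zero-Width Joiner and Variation Selectors are removed. These are typically modifier characters that affect display.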
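
To make the default behavior concrete, here is a minimal sketch that approximates it with the JDK alone. It is not the ICU implementation: it assumes `java.text.Normalizer` for the NFKC step, uses `toLowerCase(Locale.ROOT)` as a rough stand-in for full case folding, and strips only a small hand-picked subset of ignorable code points. The class and method names (`NormalizationSketch`, `normalizeForSearch`) are hypothetical.

```java
import java.text.Normalizer;
import java.util.Locale;

public class NormalizationSketch {

    // Approximates NFKC_Casefold with JDK classes only (an assumption for
    // illustration; the real filter delegates to ICU's Normalizer2).
    public static String normalizeForSearch(String text) {
        // Step 1: compatibility composition, e.g. full-width forms become ASCII.
        String nfkc = Normalizer.normalize(text, Normalizer.Form.NFKC);
        // Step 2: lowercasing as a rough substitute for Unicode case folding.
        String lowered = nfkc.toLowerCase(Locale.ROOT);
        // Step 3: drop a few ignorable code points: soft hyphen, ZWNJ, ZWJ,
        // and variation selectors U+FE00..U+FE0F (a subset, not the full set).
        return lowered.replaceAll("[\\u00AD\\u200C\\u200D\\uFE00-\\uFE0F]", "");
    }

    public static void main(String[] args) {
        // Full-width "ABC123" written with escapes to avoid encoding issues.
        System.out.println(normalizeForSearch("\uFF21\uFF22\uFF23\uFF11\uFF12\uFF13"));
        // A zero-width joiner between two halves of a word.
        System.out.println(normalizeForSearch("co\u200Dop"));
    }
}
```

Because lowercasing and case folding differ for some characters, this sketch should be read as an illustration of the three steps rather than a drop-in replacement for the filter.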
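See Also: Normalizer2, FilteredNormalizer2

Nested classes/interfaces inherited from class AttributeSource: AttributeSource.State

Fields inherited from class TokenFilter: input

Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY

| Constructor and Description |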
|---|
| ICUNormalizer2Filter(TokenStream input): Creates a new ICUNormalizer2Filter that combines NFKC normalization, case folding, and removal of Default Ignorables (NFKC_Casefold). |
| ICUNormalizer2Filter(TokenStream input, com.ibm.icu.text.Normalizer2 normalizer): Creates a new ICUNormalizer2Filter with the specified Normalizer2. |
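
The second constructor accepts any Normalizer2, so the choice of normalization form determines what the filter emits. The sketch below compares the four standard forms on small inputs. It uses the JDK's `java.text.Normalizer` as a stand-in (an assumption made to keep the example dependency-free); with ICU, the corresponding instances would presumably come from factory methods such as `Normalizer2.getNFCInstance()`. The class and method names here are hypothetical.

```java
import java.text.Normalizer;

public class NormalizationForms {

    // Thin wrapper so each form can be applied and compared uniformly.
    public static String toForm(String text, Normalizer.Form form) {
        return Normalizer.normalize(text, form);
    }

    public static void main(String[] args) {
        // U+FB01 is the "fi" ligature; U+00E9 is a precomposed e with acute.
        String input = "\uFB01 caf\u00E9";
        for (Normalizer.Form form : Normalizer.Form.values()) {
            String out = toForm(input, form);
            // Print the length in UTF-16 code units to expose the difference
            // between composed, decomposed, and compatibility-mapped output.
            System.out.println(form + " -> " + out.length() + " code units");
        }
    }
}
```

The canonical forms (NFC, NFD) keep the ligature and only compose or decompose the accent, while the compatibility forms (NFKC, NFKD) also expand the ligature into two letters, which is usually what a search index wants.
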
| Modifier and Type | Method and Description |
|---|---|
| boolean | incrementToken() |

Methods inherited from class TokenFilter: close, end, reset

Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString

public ICUNormalizer2Filter(TokenStream input)

public ICUNormalizer2Filter(TokenStream input, com.ibm.icu.text.Normalizer2 normalizer)
Parameters:
input - stream
normalizer - normalizer to use

public final boolean incrementToken() throws IOException
Specified by: incrementToken in class TokenStream
Throws: IOException

Copyright © 2000-2019 Apache Software Foundation. All Rights Reserved.