public class ICUNormalizer2Filter
extends org.apache.lucene.analysis.TokenFilter
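Normalizes token text with ICU's Normalizer2.
With this filter, you can normalize text in the following ways:
- NFKC normalization combined with case folding and removal of default ignorables (NFKC_Casefold), which is the default
- Any other normalization, by passing a specific Normalizer2 instance to the constructor
If you use the defaults, this filter is a simple way to standardize Unicode text in a language-independent way for search.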
See Also: Normalizer2, FilteredNormalizer2
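To make the default behaviour concrete, here is a minimal sketch of what NFKC normalization plus lowercasing does to a string. It deliberately uses only the JDK's `java.text.Normalizer`, not ICU, so it is an approximation: `toLowerCase` is not full Unicode case folding, and default ignorables are not removed. The class and method names (`NfkcFoldSketch`, `approxNfkcCasefold`) are hypothetical and do not come from Lucene or ICU.

```java
import java.text.Normalizer;
import java.util.Locale;

class NfkcFoldSketch {

    // Approximation only: NFKC normalization followed by locale-independent
    // lowercasing. Real NFKC_Casefold also applies full case folding and
    // removes default ignorable code points, which this sketch does not do.
    static String approxNfkcCasefold(String s) {
        String nfkc = Normalizer.normalize(s, Normalizer.Form.NFKC);
        return nfkc.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // U+FB01 (LATIN SMALL LIGATURE FI) is replaced by the two letters "fi".
        System.out.println(approxNfkcCasefold("\uFB01le"));            // prints "file"
        // Full-width letters U+FF21..U+FF23 map to ASCII, then to lowercase.
        System.out.println(approxNfkcCasefold("\uFF21\uFF22\uFF23"));  // prints "abc"
    }
}
```

Both inputs end up in a form that matches what a user would type in a plain ASCII query, which is the point of normalizing at analysis time.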
Constructor and Description |
---|
ICUNormalizer2Filter(org.apache.lucene.analysis.TokenStream input) Create a new ICUNormalizer2Filter that combines NFKC normalization, case folding, and removal of default ignorables (NFKC_Casefold) |
ICUNormalizer2Filter(org.apache.lucene.analysis.TokenStream input, com.ibm.icu.text.Normalizer2 normalizer) Create a new ICUNormalizer2Filter with the specified Normalizer2 |
Modifier and Type | Method and Description |
---|---|
boolean | incrementToken() |
Methods inherited from class org.apache.lucene.util.AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
public ICUNormalizer2Filter(org.apache.lucene.analysis.TokenStream input)
public ICUNormalizer2Filter(org.apache.lucene.analysis.TokenStream input, com.ibm.icu.text.Normalizer2 normalizer)
Parameters:
input - stream
normalizer - normalizer to use
public final boolean incrementToken() throws IOException
Specified by: incrementToken in class org.apache.lucene.analysis.TokenStream
Throws: IOException
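The two constructors and `incrementToken()` can be exercised together as in the following sketch. It assumes the Lucene ICU analysis module and ICU4J are on the classpath and that the filter lives in `org.apache.lucene.analysis.icu`; neither detail is stated on this page. `SingleTokenStream` and `normalize` are hypothetical helpers written for this illustration: the helper stream emits one fixed token so the example does not depend on any particular tokenizer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import com.ibm.icu.text.Normalizer2;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.icu.ICUNormalizer2Filter;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

class ICUNormalizer2FilterSketch {

    // Hypothetical helper: a TokenStream that produces exactly one token.
    static final class SingleTokenStream extends TokenStream {
        private final CharTermAttribute term = addAttribute(CharTermAttribute.class);
        private final String text;
        private boolean done = false;

        SingleTokenStream(String text) { this.text = text; }

        @Override
        public boolean incrementToken() {
            if (done) return false;
            clearAttributes();
            term.append(text);
            done = true;
            return true;
        }

        @Override
        public void reset() throws IOException {
            super.reset();
            done = false;
        }
    }

    // Runs one token through the filter. A null normalizer selects the
    // one-argument constructor (NFKC_Casefold); otherwise the given
    // Normalizer2 is passed to the two-argument constructor.
    static String normalize(String text, Normalizer2 normalizer) {
        TokenStream source = new SingleTokenStream(text);
        TokenStream filter = (normalizer == null)
                ? new ICUNormalizer2Filter(source)
                : new ICUNormalizer2Filter(source, normalizer);
        CharTermAttribute term = filter.addAttribute(CharTermAttribute.class);
        StringBuilder out = new StringBuilder();
        try {
            filter.reset();
            while (filter.incrementToken()) {
                out.append(term.toString());
            }
            filter.end();
            filter.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Default: the ligature U+FB01 is expanded and case is folded.
        System.out.println(normalize("\uFB01LE", null));      // prints "file"
        // Plain NFC: E + combining acute (U+0301) composes to U+00C9, case is kept.
        Normalizer2 nfc = Normalizer2.getInstance(null, "nfc", Normalizer2.Mode.COMPOSE);
        System.out.println(normalize("E\u0301", nfc).equals("\u00C9"));  // prints "true"
    }
}
```

Passing an explicit `Normalizer2` is the route to take when case should be preserved or when a `FilteredNormalizer2` is needed to exclude certain characters from normalization.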