public final class BengaliNormalizationFilter extends TokenFilter
A TokenFilter that applies BengaliNormalizer to normalize the orthography.
In some cases the normalization may cause unrelated terms to conflate. To prevent specific terms from being normalized, place an instance of SetKeywordMarkerFilter, or a custom TokenFilter that sets the KeywordAttribute, before this TokenStream.
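The arrangement described above can be sketched as follows. This is a minimal example, not a definitive setup, and it assumes Lucene 8.x with lucene-core and lucene-analyzers-common on the classpath. The class name ProtectedBengaliAnalysis, its helper methods, the field name "body", and the choice of WhitespaceTokenizer are illustrative assumptions and not part of this API.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.CharArraySet;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.bn.BengaliNormalizationFilter;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.miscellaneous.SetKeywordMarkerFilter;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

// Hypothetical helper class, used only for illustration.
final class ProtectedBengaliAnalysis {

  /** Builds an analyzer that normalizes every token except the protected terms. */
  static Analyzer newAnalyzer(String... protectedTerms) {
    // Case-sensitive set of terms that must bypass normalization.
    CharArraySet keep = new CharArraySet(Arrays.asList(protectedTerms), false);
    return new Analyzer() {
      @Override
      protected TokenStreamComponents createComponents(String fieldName) {
        Tokenizer source = new WhitespaceTokenizer();
        // The marker filter must come BEFORE the normalization filter,
        // so that KeywordAttribute is already set when normalization runs.
        TokenStream chain = new SetKeywordMarkerFilter(source, keep);
        chain = new BengaliNormalizationFilter(chain);
        return new TokenStreamComponents(source, chain);
      }
    };
  }

  /** Runs the analyzer over the text and collects the emitted terms. */
  static List<String> analyze(String text, String... protectedTerms) throws IOException {
    List<String> terms = new ArrayList<>();
    try (Analyzer analyzer = newAnalyzer(protectedTerms);
         TokenStream ts = analyzer.tokenStream("body", text)) {
      CharTermAttribute termAtt = ts.addAttribute(CharTermAttribute.class);
      ts.reset();
      while (ts.incrementToken()) {
        terms.add(termAtt.toString());
      }
      ts.end();
    }
    return terms;
  }

  public static void main(String[] args) throws IOException {
    // U+09A8 U+09A6 U+09C0 is the Bengali word "nodi" (river).
    String word = "\u09A8\u09A6\u09C0";
    System.out.println(analyze(word));        // normalization applies
    System.out.println(analyze(word, word));  // term is protected, emitted as-is
  }
}
```

In this sketch a term listed in the keyword set is emitted exactly as the tokenizer produced it, while any other term is passed through BengaliNormalizer.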
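See Also: BengaliNormalizer
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource: AttributeSource.State
Fields inherited from class org.apache.lucene.analysis.TokenFilter: input
Fields inherited from class org.apache.lucene.analysis.TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY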
Constructor and Description |
---|
BengaliNormalizationFilter(TokenStream input) |
Modifier and Type | Method and Description |
---|---|
boolean | incrementToken() |
Methods inherited from class org.apache.lucene.analysis.TokenFilter: close, end, reset
Methods inherited from class org.apache.lucene.util.AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
public BengaliNormalizationFilter(TokenStream input)
public boolean incrementToken() throws IOException
Specified by: incrementToken in class TokenStream
Throws: IOException
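The class description mentions a custom TokenFilter that sets the KeywordAttribute; the sketch below shows one way such a filter might implement incrementToken() and how the resulting chain is consumed. It assumes Lucene 8.x with lucene-core and lucene-analyzers-common on the classpath. HashtagKeywordFilter and HashtagDemo are hypothetical names invented for this example, and the rule "tokens starting with '#' are protected" is an arbitrary illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.Tokenizer;
import org.apache.lucene.analysis.bn.BengaliNormalizationFilter;
import org.apache.lucene.analysis.core.WhitespaceTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.KeywordAttribute;

/** Hypothetical filter (not part of Lucene): marks tokens starting with '#' as keywords. */
final class HashtagKeywordFilter extends TokenFilter {
  private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
  private final KeywordAttribute keywordAtt = addAttribute(KeywordAttribute.class);

  HashtagKeywordFilter(TokenStream input) {
    super(input);
  }

  @Override
  public boolean incrementToken() throws IOException {
    if (!input.incrementToken()) {
      return false; // upstream is exhausted
    }
    if (termAtt.length() > 0 && termAtt.charAt(0) == '#') {
      // Downstream filters that honor KeywordAttribute will leave this token alone.
      keywordAtt.setKeyword(true);
    }
    return true;
  }
}

/** Hypothetical driver showing the reset / incrementToken / end / close workflow. */
final class HashtagDemo {
  static List<String> analyze(String text) throws IOException {
    Tokenizer source = new WhitespaceTokenizer();
    source.setReader(new StringReader(text));
    List<String> terms = new ArrayList<>();
    // Closing the outermost filter also closes the wrapped streams.
    try (TokenStream ts = new BengaliNormalizationFilter(new HashtagKeywordFilter(source))) {
      CharTermAttribute termAtt = ts.addAttribute(CharTermAttribute.class);
      ts.reset();                     // required before the first incrementToken()
      while (ts.incrementToken()) {   // false signals the end of the stream
        terms.add(termAtt.toString());
      }
      ts.end();
    }
    return terms;
  }
}
```

The custom filter sits upstream of BengaliNormalizationFilter for the same reason SetKeywordMarkerFilter does: the keyword flag has to be set on a token before the normalization step inspects it.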
Copyright © 2000-2019 Apache Software Foundation. All Rights Reserved.