org.apache.lucene.analysis.miscellaneous
Class CodepointCountFilter
java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.analysis.util.FilteringTokenFilter
org.apache.lucene.analysis.miscellaneous.CodepointCountFilter
- All Implemented Interfaces:
- Closeable
public final class CodepointCountFilter
- extends FilteringTokenFilter
Removes words that are too long or too short from the stream.
Note: Length is calculated as the number of Unicode codepoints.
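The difference between codepoint length and char length can be seen with the standard library alone. The following is a minimal sketch, not Lucene code: the class name CodepointLengthDemo and the helper codepointLength are hypothetical names chosen for illustration, and only Character.codePointCount(char[], int, int) is taken from the documentation.

```java
public class CodepointLengthDemo {
    // Hypothetical helper: counts Unicode codepoints in a term,
    // using the same JDK call the filter documentation names.
    static int codepointLength(String term) {
        char[] buffer = term.toCharArray();
        return Character.codePointCount(buffer, 0, buffer.length);
    }

    public static void main(String[] args) {
        String ascii = "lucene";
        // 'a', then U+1F600 encoded as a surrogate pair, then 'b'
        String withSupplementary = "a\uD83D\uDE00b";

        // prints "6 chars, 6 codepoints"
        System.out.println(ascii.length() + " chars, "
                + codepointLength(ascii) + " codepoints");
        // prints "4 chars, 3 codepoints"
        System.out.println(withSupplementary.length() + " chars, "
                + codepointLength(withSupplementary) + " codepoints");
    }
}
```

A term containing supplementary characters therefore counts as shorter under this measure than its String.length() would suggest.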
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString
CodepointCountFilter
public CodepointCountFilter(Version version,
TokenStream in,
int min,
int max)
- Create a new CodepointCountFilter. This will filter out tokens whose CharTermAttribute is either too short (Character.codePointCount(char[], int, int) < min) or too long (Character.codePointCount(char[], int, int) > max).
- Parameters:
version - the Lucene match version
in - the TokenStream to consume
min - the minimum length
max - the maximum length
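The min/max test described for the constructor can be sketched without Lucene on the classpath. This is an illustration under stated assumptions, not the filter's actual source: CodepointRangeSketch, withinRange and keep are hypothetical names, and plain strings stand in for the CharTermAttribute buffer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CodepointRangeSketch {
    // Hypothetical helper: true when the codepoint count of the first
    // `length` chars of `buffer` lies in the inclusive range [min, max].
    static boolean withinRange(char[] buffer, int length, int min, int max) {
        int count = Character.codePointCount(buffer, 0, length);
        return count >= min && count <= max;
    }

    // Hypothetical helper: keeps only the tokens that pass withinRange,
    // preserving their original order.
    static List<String> keep(List<String> tokens, int min, int max) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            char[] buffer = token.toCharArray();
            if (withinRange(buffer, buffer.length, min, max)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("a", "cat", "elephant");
        // With min = 2 and max = 5, "a" is too short and "elephant" is too long.
        System.out.println(keep(tokens, 2, 5)); // prints "[cat]"
    }
}
```

Both bounds are treated as inclusive here, matching the description above: a token is dropped only when its count is strictly less than min or strictly greater than max.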
accept
public boolean accept()
- Description copied from class: FilteringTokenFilter
- Override this method and return true if the current input token should be returned by FilteringTokenFilter.incrementToken().
- Specified by:
accept in class FilteringTokenFilter
Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.