public final class UnicodeWhitespaceTokenizer extends CharTokenizer
For Unicode version see: UnicodeProps
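Nested classes/interfaces inherited from class AttributeSource: AttributeSource.State
Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY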
Constructor and Description |
---|
UnicodeWhitespaceTokenizer() Construct a new UnicodeWhitespaceTokenizer. |
UnicodeWhitespaceTokenizer(AttributeFactory factory) Construct a new UnicodeWhitespaceTokenizer using a given AttributeFactory. |
Modifier and Type | Method and Description |
---|---|
protected boolean | isTokenChar(int c) Collects only characters which do not satisfy Unicode's WHITESPACE property. |
Methods inherited from class CharTokenizer: end, fromSeparatorCharPredicate (four overloads), fromTokenCharPredicate (four overloads), incrementToken, normalize, reset
Methods inherited from class Tokenizer: close, correctOffset, setReader
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
public UnicodeWhitespaceTokenizer()
public UnicodeWhitespaceTokenizer(AttributeFactory factory)
Construct a new UnicodeWhitespaceTokenizer using a given AttributeFactory.
Parameters:
factory - the attribute factory to use for this Tokenizer
protected boolean isTokenChar(int c)
Collects only characters which do not satisfy Unicode's WHITESPACE property.
Specified by:
isTokenChar in class CharTokenizer
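The following is a minimal, standalone sketch of the rule that isTokenChar describes: a code point belongs to a token unless it has Unicode's White_Space property. It does not use Lucene. The class name UnicodeWhitespaceSketch and its methods are hypothetical, and the hard-coded code point list is an assumption taken from recent Unicode versions rather than from UnicodeProps. The sketch also ignores offsets, attributes, and any token length limit. Note that java.lang.Character.isWhitespace is not equivalent, since it excludes non-breaking spaces such as U+00A0.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Lucene.
final class UnicodeWhitespaceSketch {

    // Assumed list of code points carrying the Unicode White_Space property.
    static boolean isUnicodeWhitespace(int c) {
        return (c >= 0x0009 && c <= 0x000D)
            || c == 0x0020
            || c == 0x0085
            || c == 0x00A0
            || c == 0x1680
            || (c >= 0x2000 && c <= 0x200A)
            || c == 0x2028
            || c == 0x2029
            || c == 0x202F
            || c == 0x205F
            || c == 0x3000;
    }

    // Mirrors the documented contract: keep everything that is not whitespace.
    static boolean isTokenChar(int c) {
        return !isUnicodeWhitespace(c);
    }

    // Splits text into maximal runs of token characters, iterating by code point.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int c = text.codePointAt(i);
            i += Character.charCount(c);
            if (isTokenChar(c)) {
                current.appendCodePoint(c);
            } else if (current.length() > 0) {
                // A whitespace code point ends the token in progress.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The input contains a no-break space, an ideographic space, and an ASCII space.
        System.out.println(tokenize("foo\u00A0bar\u3000baz qux"));
        // prints [foo, bar, baz, qux]
    }
}
```

With the real class, the same input would normally be supplied through setReader and consumed with reset, incrementToken, end, and close, all of which appear in the inherited method lists.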
Copyright © 2000-2017 Apache Software Foundation. All Rights Reserved.