public final class WhitespaceTokenizer extends CharTokenizer
A tokenizer that divides text at whitespace characters, as defined by Character.isWhitespace(int). Note: that definition explicitly excludes the non-breaking space. Adjacent sequences of non-whitespace characters form tokens.
See Also: UnicodeWhitespaceTokenizer
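The splitting rule described above can be sketched with the JDK alone. The class `WhitespaceSplitSketch` and its `tokenize` method are hypothetical names introduced for illustration; they are not part of Lucene, and the sketch ignores offsets, attributes, and any limit on token length that the real tokenizer may apply.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: it mimics the splitting rule only.
final class WhitespaceSplitSketch {

    // Splits text into maximal runs of code points for which
    // Character.isWhitespace(int) returns false.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isWhitespace(cp)) {
                // Whitespace closes the token in progress, if there is one.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.appendCodePoint(cp);
            }
        }
        // Flush the last token when the text does not end in whitespace.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The non-breaking space between "b" and "c" is not a separator,
        // so those two letters stay in one token.
        System.out.println(tokenize("a  b\u00A0c\td"));
    }
}
```

Because the non-breaking space (U+00A0) is not whitespace under this definition, text that uses it as a separator stays in one token; the UnicodeWhitespaceTokenizer referenced above is the related class to look at for a different whitespace definition.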
Nested classes inherited from class AttributeSource: AttributeSource.State
Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY
Constructor and Description |
---|
WhitespaceTokenizer() Construct a new WhitespaceTokenizer. |
WhitespaceTokenizer(AttributeFactory factory) Construct a new WhitespaceTokenizer using a given AttributeFactory. |
Modifier and Type | Method and Description |
---|---|
protected boolean | isTokenChar(int c) Collects only characters which do not satisfy Character.isWhitespace(int). |
Methods inherited from class CharTokenizer: end, fromSeparatorCharPredicate, fromSeparatorCharPredicate, fromSeparatorCharPredicate, fromSeparatorCharPredicate, fromTokenCharPredicate, fromTokenCharPredicate, fromTokenCharPredicate, fromTokenCharPredicate, incrementToken, normalize, reset
Methods inherited from class Tokenizer: close, correctOffset, setReader
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
public WhitespaceTokenizer()
Construct a new WhitespaceTokenizer.
public WhitespaceTokenizer(AttributeFactory factory)
Construct a new WhitespaceTokenizer using a given AttributeFactory.
Parameters: factory - the attribute factory to use for this Tokenizer
protected boolean isTokenChar(int c)
Collects only characters which do not satisfy Character.isWhitespace(int).
Specified by: isTokenChar in class CharTokenizer
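As a rough illustration of the predicate described for isTokenChar(int c), the following JDK-only sketch negates Character.isWhitespace(int) and probes a few code points. `TokenCharSketch` is a hypothetical stand-in written for this note, not the Lucene source; it only assumes the behavior stated above.

```java
// Hypothetical stand-in for the predicate described above; not Lucene code.
final class TokenCharSketch {

    // A code point counts as a token character when it is NOT Java whitespace.
    static boolean isTokenChar(int c) {
        return !Character.isWhitespace(c);
    }

    public static void main(String[] args) {
        // Space, tab, newline, NO-BREAK SPACE, EM SPACE, and a letter.
        int[] samples = {' ', '\t', '\n', 0x00A0, 0x2003, 'x'};
        for (int cp : samples) {
            System.out.printf("U+%04X tokenChar=%b%n", cp, isTokenChar(cp));
        }
    }
}
```

Character.isWhitespace(int) treats the non-breaking variants U+00A0, U+2007, and U+202F as non-whitespace, so under this predicate they are token characters, while other Unicode space separators such as U+2003 (EM SPACE) are separators.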
Copyright © 2000-2017 Apache Software Foundation. All Rights Reserved.