public final class WhitespaceTokenizer extends CharTokenizer

You must specify the required Version compatibility when creating WhitespaceTokenizer: CharTokenizer uses an int-based API to normalize and detect token characters. See CharTokenizer.isTokenChar(int) and CharTokenizer.normalize(int) for details.

Nested classes inherited from class AttributeSource: AttributeSource.AttributeFactory, AttributeSource.State
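The int-based API in the class description takes Unicode code points rather than 16-bit char values, so a supplementary character reaches isTokenChar(int) as a single int. The following is a minimal plain-JDK sketch of that style of iteration. It does not use Lucene, and the class and method names (CodePointSketch, countNonWhitespaceCodePoints) are hypothetical names chosen for illustration.

```java
final class CodePointSketch {

    /**
     * Counts the code points in text that are not whitespace,
     * visiting the text one int code point at a time.
     */
    static int countNonWhitespaceCodePoints(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); ) {
            // codePointAt joins a surrogate pair into one int value
            int cp = text.codePointAt(i);
            if (!Character.isWhitespace(cp)) {
                count++;
            }
            // advance by 1 char, or by 2 for a supplementary character
            i += Character.charCount(cp);
        }
        return count;
    }

    public static void main(String[] args) {
        // 'a', a space, then U+1D54F written as a surrogate pair
        String s = "a \uD835\uDD4F";
        System.out.println(s.length());                      // prints 4 (char units)
        System.out.println(countNonWhitespaceCodePoints(s)); // prints 2 (code points)
    }
}
```

A char-based loop would see the surrogate pair as two separate values; the int-based form classifies it once.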
Constructor and Description |
---|
WhitespaceTokenizer(Version matchVersion, AttributeSource.AttributeFactory factory, Reader in): Construct a new WhitespaceTokenizer using a given AttributeSource.AttributeFactory. |
WhitespaceTokenizer(Version matchVersion, AttributeSource source, Reader in): Construct a new WhitespaceTokenizer using a given AttributeSource. |
WhitespaceTokenizer(Version matchVersion, Reader in): Construct a new WhitespaceTokenizer. |
Modifier and Type | Method and Description |
---|---|
protected boolean | isTokenChar(int c): Collects only characters which do not satisfy Character.isWhitespace(int). |
Methods inherited from class CharTokenizer: end, incrementToken, normalize, reset
Methods inherited from class Tokenizer: close, correctOffset, setReader
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState
public WhitespaceTokenizer(Version matchVersion, Reader in)
Construct a new WhitespaceTokenizer.
Parameters:
in - the input to split up into tokens

public WhitespaceTokenizer(Version matchVersion, AttributeSource source, Reader in)
Construct a new WhitespaceTokenizer using a given AttributeSource.

public WhitespaceTokenizer(Version matchVersion, AttributeSource.AttributeFactory factory, Reader in)
Construct a new WhitespaceTokenizer using a given AttributeSource.AttributeFactory.

protected boolean isTokenChar(int c)
Collects only characters which do not satisfy Character.isWhitespace(int).
Specified by:
isTokenChar in class CharTokenizer
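
The isTokenChar(int) rule can be illustrated without Lucene: a token is a maximal run of code points for which Character.isWhitespace(int) is false. The sketch below is a plain-JDK approximation under that assumption, not the library's implementation; WhitespaceSplitSketch and split are hypothetical names, and the sketch does not model offsets, attributes, or any buffer limits of the real class.

```java
import java.util.ArrayList;
import java.util.List;

final class WhitespaceSplitSketch {

    /** Splits text into maximal runs of non-whitespace code points. */
    static List<String> split(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!Character.isWhitespace(cp)) {
                // same predicate as the documented isTokenChar rule
                current.appendCodePoint(cp);
            } else if (current.length() > 0) {
                // whitespace ends the current token, if one is open
                tokens.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            tokens.add(current.toString()); // flush the trailing token
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [The, quick, brown, fox]
        System.out.println(split("The  quick\tbrown\nfox"));
        // U+00A0 (no-break space) is not whitespace for Character.isWhitespace,
        // so "10" and "km" stay in one token and the result has 2 tokens
        System.out.println(split("10\u00A0km away").size()); // prints 2
    }
}
```

One consequence of relying on Character.isWhitespace(int) is that non-breaking spaces such as U+00A0 do not separate tokens, which is worth keeping in mind for text copied from web pages.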
Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.