public final class PatternTokenizer extends Tokenizer
group=-1 (the default) is equivalent to "split". In this case, the tokens will
be equivalent to the output from (without empty tokens):
String.split(java.lang.String)
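The split behaviour described above can be sketched with plain `java.util.regex`, without the tokenizer itself. This is only an illustration of the documented semantics, not the class's actual implementation; the class name `SplitModeSketch` and the helper `splitTokens` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class SplitModeSketch {
    // Hypothetical helper: mimics group=-1 by splitting the input on the
    // pattern and discarding zero-length pieces.
    public static List<String> splitTokens(String input, Pattern pattern) {
        List<String> tokens = new ArrayList<>();
        // A limit of -1 keeps trailing empty strings, so every empty piece
        // is dropped explicitly by the check below.
        for (String piece : pattern.split(input, -1)) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [aaa, bbb]
        System.out.println(splitTokens(",aaa,,bbb,", Pattern.compile(",")));
    }
}
```

Leading, trailing, and consecutive delimiters all produce empty pieces in the raw split, which is why the filter is needed to match the "without empty tokens" wording.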
Using group >= 0 selects the matching group as the token. For example, if you have:
pattern = \'([^\']+)\'
group = 0
input = aaa 'bbb' 'ccc'
the output will be two tokens: 'bbb' and 'ccc' (including the ' marks). With the same input but using group=1, the output would be: bbb and ccc (no ' marks).
NOTE: This Tokenizer does not output tokens that are of zero length.
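The group-selection example can be reproduced with `java.util.regex.Matcher` alone. The sketch below is an approximation of the documented behaviour rather than the tokenizer's own code; `GroupModeSketch` and `groupTokens` are hypothetical names, and the pattern is written as `'([^']+)'`, which matches the same text as the escaped form shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupModeSketch {
    // Hypothetical helper: mimics group >= 0 by emitting the chosen capture
    // group of every match, skipping zero-length tokens.
    public static List<String> groupTokens(String input, Pattern pattern, int group) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            String token = matcher.group(group);
            // group() returns null when the group did not take part in the match.
            if (token != null && !token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Pattern quoted = Pattern.compile("'([^']+)'");
        String input = "aaa 'bbb' 'ccc'";
        System.out.println(groupTokens(input, quoted, 0)); // prints ['bbb', 'ccc']
        System.out.println(groupTokens(input, quoted, 1)); // prints [bbb, ccc]
    }
}
```

Group 0 is the whole match, so the quote marks are kept; group 1 is the first parenthesised sub-expression, so they are dropped.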
See Also: Pattern

Nested classes inherited from class AttributeSource: AttributeSource.AttributeFactory, AttributeSource.State

| Constructor and Description |
|---|
| PatternTokenizer(Reader input, Pattern pattern, int group): creates a new PatternTokenizer returning tokens from group (-1 for split functionality) |

| Modifier and Type | Method and Description |
|---|---|
| void | end() |
| boolean | incrementToken() |
| void | reset() |

Methods inherited from class Tokenizer: close, correctOffset, setReader

Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState

public PatternTokenizer(Reader input, Pattern pattern, int group) throws IOException
Throws: IOException

public boolean incrementToken()
Specified by: incrementToken in class TokenStream

public void end()
Overrides: end in class TokenStream

public void reset() throws IOException
Overrides: reset in class TokenStream
Throws: IOException

Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.