public final class PatternTokenizer extends Tokenizer
With group=-1 (the default), the tokenizer behaves like "split": the tokens are the same as the output of String.split(java.lang.String), with empty tokens removed.
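The split behavior described above can be sketched with the JDK alone. This is a minimal illustration, not the tokenizer's actual implementation; the class name SplitModeSketch and the helper splitNonEmpty are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class: mimics group = -1 ("split") using only java.util.regex.
final class SplitModeSketch {

    /** Splits the input on the pattern and drops zero-length pieces. */
    static List<String> splitNonEmpty(Pattern pattern, String input) {
        List<String> tokens = new ArrayList<>();
        // A limit of -1 keeps trailing empty strings, so the filter below
        // (not split itself) decides which pieces are discarded.
        for (String piece : pattern.split(input, -1)) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [a, b, c]
        System.out.println(splitNonEmpty(Pattern.compile(","), "a,,b,c,"));
    }
}
```

The empty pieces between the doubled comma and after the trailing comma are dropped, which matches the "without empty tokens" qualification.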
Using group >= 0 selects the matching group as the token. For example, if you have:
    pattern = \'([^\']+)\'
    group = 0
    input = aaa 'bbb' 'ccc'

the output will be two tokens: 'bbb' and 'ccc' (including the ' marks). With the same input but using group=1, the output would be: bbb and ccc (no ' marks).
NOTE: This Tokenizer does not output tokens that are of zero length.
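The group selection described above can be reproduced with java.util.regex directly. The sketch below is an approximation of the documented behavior under the assumption that each match contributes one token; GroupModeSketch and tokensFromGroup are hypothetical names, and the pattern is written without the backslashes because \' and ' match the same character in a Java regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class: mimics group >= 0 using only java.util.regex.
final class GroupModeSketch {

    /** Collects the requested capture group from every match, skipping empty results. */
    static List<String> tokensFromGroup(Pattern pattern, int group, String input) {
        List<String> tokens = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            String token = matcher.group(group);
            // A group that did not participate is null; zero-length tokens are not emitted.
            if (token != null && !token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Pattern quoted = Pattern.compile("'([^']+)'");
        String input = "aaa 'bbb' 'ccc'";
        System.out.println(tokensFromGroup(quoted, 0, input)); // prints ['bbb', 'ccc']
        System.out.println(tokensFromGroup(quoted, 1, input)); // prints [bbb, ccc]
    }
}
```

Group 0 is the whole match, so the quote marks are kept; group 1 is the first parenthesized sub-expression, so they are not.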
See Also: Pattern

Nested classes inherited from class AttributeSource: AttributeSource.State

Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY

| Constructor and Description |
|---|
| PatternTokenizer(AttributeFactory factory, Pattern pattern, int group): creates a new PatternTokenizer returning tokens from group (-1 for split functionality) |
| PatternTokenizer(Pattern pattern, int group): creates a new PatternTokenizer returning tokens from group (-1 for split functionality) |

| Modifier and Type | Method and Description |
|---|---|
| void | close() |
| void | end() |
| boolean | incrementToken() |
| void | reset() |

Methods inherited from class Tokenizer: correctOffset, setReader

Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString

public PatternTokenizer(Pattern pattern, int group)
public PatternTokenizer(AttributeFactory factory, Pattern pattern, int group)

public boolean incrementToken()
Specified by: incrementToken in class TokenStream

public void end() throws IOException
Overrides: end in class TokenStream
Throws: IOException

public void close() throws IOException
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable
Overrides: close in class Tokenizer
Throws: IOException

public void reset() throws IOException
Overrides: reset in class Tokenizer
Throws: IOException

Copyright © 2000-2020 Apache Software Foundation. All Rights Reserved.