public abstract class BaseUIMATokenizer extends Tokenizer
A Tokenizer which is able to analyze the given input with a UIMA AnalysisEngine.
Nested classes/interfaces inherited from class AttributeSource: AttributeSource.State
Modifier and Type | Field and Description |
---|---|
protected org.apache.uima.analysis_engine.AnalysisEngine | ae |
protected org.apache.uima.cas.CAS | cas |
protected org.apache.uima.cas.FSIterator<org.apache.uima.cas.text.AnnotationFS> | iterator |
Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY
Modifier | Constructor and Description |
---|---|
protected | BaseUIMATokenizer(AttributeFactory factory, String descriptorPath, Map<String,Object> configurationParameters) |
Modifier and Type | Method and Description |
---|---|
protected void | analyzeInput(): analyzes the tokenizer input using the given analysis engine |
protected abstract void | initializeIterator(): initializes the FSIterator which is used to build tokens at each incrementToken() method call |
void | reset() |
Methods inherited from class Tokenizer: close, correctOffset, setReader
Methods inherited from class TokenStream: end, incrementToken
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
protected org.apache.uima.cas.FSIterator<org.apache.uima.cas.text.AnnotationFS> iterator
protected org.apache.uima.analysis_engine.AnalysisEngine ae
protected org.apache.uima.cas.CAS cas
protected BaseUIMATokenizer(AttributeFactory factory, String descriptorPath, Map<String,Object> configurationParameters)
protected void analyzeInput() throws org.apache.uima.resource.ResourceInitializationException, org.apache.uima.analysis_engine.AnalysisEngineProcessException, IOException
cas will be filled with extracted metadata (UIMA annotations, feature structures).
Throws:
IOException - If there is a low-level I/O error.
org.apache.uima.resource.ResourceInitializationException
org.apache.uima.analysis_engine.AnalysisEngineProcessException
protected abstract void initializeIterator() throws IOException
Throws:
IOException - If there is a low-level I/O error.
public void reset() throws IOException
Overrides: reset in class Tokenizer
Throws:
IOException
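The method descriptions above suggest a lifecycle: reset() prepares the tokenizer, initializeIterator() builds the iterator over the analysis results (typically after analyzeInput() has filled cas), and each incrementToken() call consumes one element. The following is a minimal stand-alone sketch of that pattern only. It does not use the real Lucene or UIMA classes: Span, BaseSketchTokenizer, WhitespaceSketchTokenizer, and tokenize are hypothetical stand-ins, the whitespace splitting stands in for a real analysis engine, and the assumption that reset() clears the iterator is not stated on this page.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class TokenizerLifecycleSketch {

    /** Hypothetical stand-in for a UIMA annotation: a [begin, end) span of the input. */
    static final class Span {
        final int begin;
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
    }

    /** Hypothetical stand-in for the abstract base class documented above. */
    abstract static class BaseSketchTokenizer {
        protected String input;             // stands in for the tokenizer's input
        protected List<Span> analysis;      // stands in for the cas field
        protected Iterator<Span> iterator;  // stands in for the iterator field

        void setInput(String input) { this.input = input; }

        /** Stands in for analyzeInput(): fills analysis with one span per whitespace-free run. */
        protected void analyzeInput() {
            analysis = new ArrayList<>();
            int i = 0;
            while (i < input.length()) {
                while (i < input.length() && Character.isWhitespace(input.charAt(i))) i++;
                int start = i;
                while (i < input.length() && !Character.isWhitespace(input.charAt(i))) i++;
                if (i > start) analysis.add(new Span(start, i));
            }
        }

        /** Subclasses decide which analysis results the iterator walks over. */
        protected abstract void initializeIterator();

        /** Assumption: reset drops the iterator so the next input is analyzed afresh. */
        public void reset() { iterator = null; }
    }

    /** Hypothetical concrete subclass: one token per span. */
    static final class WhitespaceSketchTokenizer extends BaseSketchTokenizer {
        String term; // stands in for a term attribute

        @Override
        protected void initializeIterator() {
            analyzeInput();
            iterator = analysis.iterator();
        }

        /** Builds the iterator lazily on the first call, then emits one token per call. */
        boolean incrementToken() {
            if (iterator == null) initializeIterator();
            if (!iterator.hasNext()) return false;
            Span s = iterator.next();
            term = input.substring(s.begin, s.end);
            return true;
        }
    }

    /** Drives the lifecycle: set input, reset, then pull tokens until exhausted. */
    public static List<String> tokenize(String text) {
        WhitespaceSketchTokenizer t = new WhitespaceSketchTokenizer();
        t.setInput(text);
        t.reset();
        List<String> out = new ArrayList<>();
        while (t.incrementToken()) out.add(t.term);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("UIMA meets  Lucene"));
    }
}
```

Building the iterator lazily inside incrementToken() keeps reset() cheap and defers the analysis until a token is actually requested. A real subclass would instead obtain its iterator from cas and populate Lucene attributes.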
Copyright © 2000-2016 Apache Software Foundation. All Rights Reserved.