org.apache.lucene.analysis.uima
Class BaseUIMATokenizer
java.lang.Object
  org.apache.lucene.util.AttributeSource
    org.apache.lucene.analysis.TokenStream
      org.apache.lucene.analysis.Tokenizer
        org.apache.lucene.analysis.uima.BaseUIMATokenizer
- All Implemented Interfaces:
- Closeable
- Direct Known Subclasses:
- UIMAAnnotationsTokenizer, UIMATypeAwareAnnotationsTokenizer
public abstract class BaseUIMATokenizer
- extends Tokenizer
Abstract base implementation of a Tokenizer that analyzes the given input with a UIMA AnalysisEngine.
Field Summary
protected org.apache.uima.analysis_engine.AnalysisEngine | ae
protected org.apache.uima.cas.CAS | cas
protected org.apache.uima.cas.FSIterator<org.apache.uima.cas.text.AnnotationFS> | iterator
Fields inherited from class org.apache.lucene.analysis.Tokenizer
input
Method Summary
protected void | analyzeInput()
Analyzes the tokenizer input using the given analysis engine. The cas field is filled with the extracted metadata (UIMA annotations, feature structures).
void | end()
protected abstract void | initializeIterator()
Initializes the FSIterator that is used to build tokens on each incrementToken() call.
void | reset()
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState
iterator
protected org.apache.uima.cas.FSIterator<org.apache.uima.cas.text.AnnotationFS> iterator
ae
protected org.apache.uima.analysis_engine.AnalysisEngine ae
cas
protected org.apache.uima.cas.CAS cas
BaseUIMATokenizer
protected BaseUIMATokenizer(Reader reader,
                            String descriptorPath,
                            Map<String,Object> configurationParameters)
analyzeInput
protected void analyzeInput()
throws org.apache.uima.resource.ResourceInitializationException,
org.apache.uima.analysis_engine.AnalysisEngineProcessException,
IOException
- Analyzes the tokenizer input using the given analysis engine. The cas field is filled with the extracted metadata (UIMA annotations, feature structures).
- Throws:
IOException
- If there is a low-level I/O error.
org.apache.uima.resource.ResourceInitializationException
org.apache.uima.analysis_engine.AnalysisEngineProcessException
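A UIMA CAS takes its document text as a String, so the tokenizer's Reader has to be read to the end before the engine can process it. The following is a minimal sketch of that one step, using only the JDK. ReaderDrain and readFully are hypothetical names that do not belong to Lucene or UIMA, and the actual implementation may read the input differently.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper, not part of Lucene or UIMA: reads a Reader to
// exhaustion so its content can be handed to an API that expects a String.
final class ReaderDrain {
    private ReaderDrain() {}

    static String readFully(Reader reader) throws IOException {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[1024];
        int read;
        // Reader.read returns -1 once the end of the stream is reached.
        while ((read = reader.read(buffer, 0, buffer.length)) != -1) {
            text.append(buffer, 0, read);
        }
        return text.toString();
    }
}

class ReaderDrainDemo {
    public static void main(String[] args) throws IOException {
        // The drained text is what an engine would receive as document text.
        String text = ReaderDrain.readFully(new StringReader("UIMA meets Lucene"));
        System.out.println(text);
    }
}
```

Draining the Reader up front means the whole document is held in memory while it is analyzed, which is worth keeping in mind for very large inputs.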
initializeIterator
protected abstract void initializeIterator()
throws IOException
- Initializes the FSIterator that is used to build tokens on each incrementToken() call.
- Throws:
IOException
- If there is a low-level I/O error.
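Note that initializeIterator() declares only IOException, while analyzeInput() can also throw UIMA exceptions, so a subclass that calls one from the other has to translate those failures. The sketch below models this with JDK-only stand-ins. Every Sketch* name is hypothetical. The lazy call from incrementToken() and the reset() that discards the iterator are assumptions about how a subclass might use the hook, not behavior this page states.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Iterator;

// Stand-in for a UIMA processing failure; hypothetical, for illustration only.
class SketchEngineException extends Exception {
    SketchEngineException(String message) { super(message); }
}

// Stand-in for the abstract base class: owns the iterator and defines the hooks.
abstract class SketchBaseTokenizer {
    protected final String input;
    protected String[] analyzed;          // plays the role of the CAS
    protected Iterator<String> iterator;  // plays the role of the FSIterator

    SketchBaseTokenizer(String input) { this.input = input; }

    // Plays the role of analyzeInput(): fills 'analyzed' from the input.
    protected void analyzeInput() throws SketchEngineException, IOException {
        if (input == null) {
            throw new SketchEngineException("no input to analyze");
        }
        String trimmed = input.trim();
        analyzed = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    protected abstract void initializeIterator() throws IOException;

    // Assumption: dropping the iterator forces a fresh analysis on next use.
    public void reset() { iterator = null; }
}

class SketchWhitespaceTokenizer extends SketchBaseTokenizer {
    String current; // token produced by the last successful incrementToken()

    SketchWhitespaceTokenizer(String input) { super(input); }

    @Override
    protected void initializeIterator() throws IOException {
        try {
            analyzeInput();
        } catch (SketchEngineException e) {
            // The hook may only throw IOException, so engine failures are wrapped.
            throw new IOException(e);
        }
        iterator = Arrays.asList(analyzed).iterator();
    }

    public boolean incrementToken() throws IOException {
        if (iterator == null) {
            initializeIterator(); // lazy: runs at most once per reset()
        }
        if (!iterator.hasNext()) {
            return false;
        }
        current = iterator.next();
        return true;
    }
}

class SketchTokenizerDemo {
    public static void main(String[] args) throws IOException {
        SketchWhitespaceTokenizer tokenizer = new SketchWhitespaceTokenizer("one two three");
        tokenizer.reset();
        while (tokenizer.incrementToken()) {
            System.out.println(tokenizer.current);
        }
    }
}
```

Deferring the analysis until the first incrementToken() call keeps construction cheap and lets the same instance be consumed again after reset().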
reset
public void reset()
throws IOException
- Overrides:
reset in class TokenStream
- Throws:
IOException
end
public void end()
throws IOException
- Overrides:
end in class TokenStream
- Throws:
IOException
Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.