public final class TokenStreamFromTermVector extends TokenStream
This TokenStream supports an efficient reset(), so there's no need to wrap it with a caching implementation.
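
To illustrate why a rewindable stream needs no caching wrapper, here is a minimal sketch. It does not use the Lucene API: `RewindableStreamSketch`, `term()` and `drain` are hypothetical names, and the sketch only assumes that the tokens are held in memory so that `reset()` can move a cursor back to the start.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a token stream whose tokens are already held in memory. */
final class RewindableStreamSketch {
    private final List<String> tokens; // built once, reused on every pass
    private int cursor = 0;
    private String current;

    RewindableStreamSketch(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    /** Advances to the next token; returns false when the stream is exhausted. */
    boolean incrementToken() {
        if (cursor >= tokens.size()) {
            return false;
        }
        current = tokens.get(cursor++);
        return true;
    }

    /** The term of the token most recently returned by incrementToken(). */
    String term() {
        return current;
    }

    /** Rewinds the cursor; nothing is re-read or re-analyzed. */
    void reset() {
        cursor = 0;
    }

    /** Consumes the stream from its current cursor to the end. */
    static List<String> drain(RewindableStreamSketch stream) {
        List<String> out = new ArrayList<>();
        while (stream.incrementToken()) {
            out.add(stream.term());
        }
        return out;
    }
}
```

Calling `reset()` and then draining the stream a second time yields the same tokens as the first pass, which is all a caching wrapper would otherwise provide.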
The implementation creates an array of tokens indexed by token position, which is fine as long as there are no massive jumps in positions. It also assumes there are not large numbers of tokens at the same position, since it adds them to a per-position linked list with O(N^2) complexity. When the term vector has no positions, it divides the startOffset by 8 and uses the result as a temporary substitute. In that case, tokens with the same startOffset occupy the same final position; otherwise tokens become adjacent.
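
The bucketing described above can be sketched roughly as follows. This is not the Lucene source: `PositionBucketSketch`, `Token`, `order` and the `term/increment` output format are hypothetical, and ordering each per-position list by start offset is an assumption made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of position bucketing; not the Lucene implementation. */
final class PositionBucketSketch {

    /** One token read from a term vector; position is ignored when positions are absent. */
    static final class Token {
        final String term;
        final int position;
        final int startOffset;
        Token next; // next token in the same bucket

        Token(String term, int position, int startOffset) {
            this.term = term;
            this.position = position;
            this.startOffset = startOffset;
        }
    }

    /** Bucket index: the real position, or startOffset / 8 as a temporary substitute. */
    static int index(Token t, boolean hasPositions) {
        return hasPositions ? t.position : t.startOffset / 8;
    }

    /** Returns "term/positionIncrement" strings in emission order. */
    static List<String> order(List<Token> tokens, boolean hasPositions) {
        int maxIndex = -1;
        for (Token t : tokens) {
            maxIndex = Math.max(maxIndex, index(t, hasPositions));
        }
        // A large jump in positions makes this array large and mostly empty.
        Token[] buckets = new Token[maxIndex + 1];

        for (Token t : tokens) {
            int i = index(t, hasPositions);
            // Sorted insert by start offset (an assumption of this sketch). Each insert
            // walks the list, so many tokens in one bucket cost O(N^2) overall.
            if (buckets[i] == null || t.startOffset < buckets[i].startOffset) {
                t.next = buckets[i];
                buckets[i] = t;
            } else {
                Token prev = buckets[i];
                while (prev.next != null && prev.next.startOffset <= t.startOffset) {
                    prev = prev.next;
                }
                t.next = prev.next;
                prev.next = t;
            }
        }

        List<String> out = new ArrayList<>();
        int prevPosition = -1;
        int prevStartOffset = -1;
        for (Token head : buckets) {
            for (Token t = head; t != null; t = t.next) {
                int increment;
                if (hasPositions) {
                    increment = t.position - prevPosition;
                    prevPosition = t.position;
                } else {
                    // Same startOffset: same final position. Otherwise: adjacent.
                    boolean same = !out.isEmpty() && t.startOffset == prevStartOffset;
                    increment = same ? 0 : 1;
                    prevStartOffset = t.startOffset;
                }
                out.add(t.term + "/" + increment);
            }
        }
        return out;
    }
}
```

With positions, tokens at positions 0, 1, 1 and 3 come out with increments 1, 1, 0 and 2. Without positions, tokens with start offsets 0, 0, 9 and 12 come out with increments 1, 0, 1 and 1, so the gaps introduced by dividing by 8 disappear from the final stream.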

Nested classes inherited from class AttributeSource: AttributeSource.State

| Modifier and Type | Field and Description |
|---|---|
| static AttributeFactory | ATTRIBUTE_FACTORY |

Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY

| Constructor and Description |
|---|
| TokenStreamFromTermVector(Terms vector, int maxStartOffset) Constructor. |

| Modifier and Type | Method and Description |
|---|---|
| Terms | getTermVectorTerms() |
| boolean | incrementToken() |
| void | reset() |

Methods inherited from class TokenStream: close, end

Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString

public static final AttributeFactory ATTRIBUTE_FACTORY

public TokenStreamFromTermVector(Terms vector, int maxStartOffset) throws IOException

Constructor. The uninversion doesn't happen here; it's delayed until the first call to incrementToken.

Parameters:
vector - Terms that contains the data for creating the TokenStream. Must have positions and/or offsets.
maxStartOffset - if a token's start offset exceeds this then the token is not added. -1 disables the limit.

Throws:
IOException

public Terms getTermVectorTerms()
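
As a sketch of the maxStartOffset rule documented for the constructor, the check below reads "exceeds" as strictly greater than, so a token whose start offset equals the limit is kept. `MaxStartOffsetSketch` and `keep` are hypothetical names; the code does not touch the Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the maxStartOffset limit; not the Lucene source. */
final class MaxStartOffsetSketch {

    /** Returns the start offsets that would be kept; -1 disables the limit. */
    static List<Integer> keep(int[] startOffsets, int maxStartOffset) {
        List<Integer> kept = new ArrayList<>();
        for (int start : startOffsets) {
            // Assumption: "exceeds" means strictly greater than the limit.
            if (maxStartOffset != -1 && start > maxStartOffset) {
                continue; // token is not added
            }
            kept.add(start);
        }
        return kept;
    }
}
```

For start offsets 0, 5, 10 and 11, a limit of 10 keeps 0, 5 and 10, while a limit of -1 keeps all four.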

public void reset() throws IOException

Overrides:
reset in class TokenStream

Throws:
IOException

public boolean incrementToken() throws IOException

Specified by:
incrementToken in class TokenStream

Throws:
IOException

Copyright © 2000-2016 Apache Software Foundation. All Rights Reserved.