public final class TokenStreamFromTermVector extends TokenStream
This TokenStream supports an efficient reset(), so there's no need to wrap with a caching impl.
The implementation creates an array of tokens indexed by token position, which works well as long as there aren't massive jumps in positions. It also assumes there aren't large numbers of tokens at the same position, since it adds them to a per-position linked list at a cost of O(N^2) for N tokens sharing a position. When the term vector has no positions, it divides the startOffset by 8 to use as a temporary substitute. In that case, tokens with the same startOffset will occupy the same final position; otherwise tokens become adjacent.
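The bucketing described above can be illustrated with a minimal, self-contained sketch. It is not the Lucene implementation: the class `PositionBucketSketch`, its `Token` node, and the method names are hypothetical, and ordering tokens by start offset within a position is an assumption made for this example. It only shows the idea of an array indexed by (substitute) position, with a linked list per slot.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of position-indexed token buckets; not Lucene code. */
final class PositionBucketSketch {

    /** Minimal token node; each array slot holds a singly linked list of these. */
    static final class Token {
        final String term;
        final int startOffset;
        Token next;

        Token(String term, int startOffset) {
            this.term = term;
            this.startOffset = startOffset;
        }
    }

    /** Temporary stand-in for a missing position, as the text describes: startOffset / 8. */
    static int substitutePosition(int startOffset) {
        return startOffset / 8;
    }

    /**
     * Inserts a token into the list at its position, kept sorted by start offset
     * (an assumption of this sketch). Walking the list on every insert is what
     * makes N tokens at one position cost O(N^2) overall.
     */
    static void insert(Token[] byPosition, int position, Token token) {
        Token head = byPosition[position];
        if (head == null || token.startOffset < head.startOffset) {
            token.next = head;
            byPosition[position] = token;
            return;
        }
        Token prev = head;
        while (prev.next != null && prev.next.startOffset <= token.startOffset) {
            prev = prev.next;
        }
        token.next = prev.next;
        prev.next = token;
    }

    /** Buckets terms by substitute position and returns them in emit order. */
    static List<String> order(String[] terms, int[] startOffsets) {
        int maxPosition = 0;
        for (int offset : startOffsets) {
            maxPosition = Math.max(maxPosition, substitutePosition(offset));
        }
        // The array is sized by the largest position, so big jumps waste memory.
        Token[] byPosition = new Token[maxPosition + 1];
        for (int i = 0; i < terms.length; i++) {
            int position = substitutePosition(startOffsets[i]);
            insert(byPosition, position, new Token(terms[i], startOffsets[i]));
        }
        List<String> out = new ArrayList<>();
        for (Token head : byPosition) {
            for (Token t = head; t != null; t = t.next) {
                out.add(t.term);
            }
        }
        return out;
    }
}
```

With terms supplied in sorted term order, as a term vector would enumerate them, `order(new String[]{"brown", "fox", "quick", "the"}, new int[]{10, 16, 4, 0})` returns the terms in start-offset order: `the`, `quick`, `brown`, `fox`.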
Nested classes inherited from class AttributeSource: AttributeSource.State
Fields inherited from class TokenStream: DEFAULT_TOKEN_ATTRIBUTE_FACTORY
Constructor and Description |
---|
TokenStreamFromTermVector(Terms vector, int maxStartOffset) Constructor. |
Modifier and Type | Method and Description |
---|---|
Terms | getTermVectorTerms() |
boolean | incrementToken() |
void | reset() |
Methods inherited from class TokenStream: close, end
Methods inherited from class AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
public TokenStreamFromTermVector(Terms vector, int maxStartOffset) throws IOException
Constructor. The uninversion doesn't happen here; it's delayed till the first call to incrementToken().
Parameters:
vector - Terms that contains the data for creating the TokenStream. Must have positions and/or offsets.
maxStartOffset - if a token's start offset exceeds this, the token is not added. -1 disables the limit.
Throws:
IOException
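The maxStartOffset rule can be written as a one-line predicate. The sketch below is only a reading of the parameter description above, not code from the library; `MaxStartOffsetSketch` and `accepts` are hypothetical names.

```java
/** Hypothetical reading of the documented maxStartOffset rule; not Lucene code. */
final class MaxStartOffsetSketch {

    /**
     * Returns true if a token with the given start offset would be kept:
     * -1 disables the limit, otherwise the token is dropped only when its
     * start offset exceeds maxStartOffset.
     */
    static boolean accepts(int startOffset, int maxStartOffset) {
        return maxStartOffset == -1 || startOffset <= maxStartOffset;
    }
}
```

Under this reading a token whose start offset equals the limit is still kept, since the description says the token is skipped only when the offset exceeds it.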
public Terms getTermVectorTerms()
public void reset() throws IOException
Overrides:
reset in class TokenStream
Throws:
IOException
public boolean incrementToken() throws IOException
Specified by:
incrementToken in class TokenStream
Throws:
IOException
Copyright © 2000-2019 Apache Software Foundation. All Rights Reserved.