public interface TermToBytesRefAttribute extends Attribute
Consumers of this attribute call getBytesRef() up-front, and then invoke fillBytesRef() for each term. Example:
    final TermToBytesRefAttribute termAtt = tokenStream.getAttribute(TermToBytesRefAttribute.class);
    final BytesRef bytes = termAtt.getBytesRef();

    while (tokenStream.incrementToken()) {

      // you must call termAtt.fillBytesRef() before doing something with the bytes.
      // this encodes the term value (internally it might be a char[], etc) into the bytes.
      int hashCode = termAtt.fillBytesRef();

      if (isInteresting(bytes)) {

        // because the bytes are reused by the attribute (like CharTermAttribute's char[] buffer),
        // you should make a copy if you need persistent access to the bytes, otherwise they will
        // be rewritten across calls to incrementToken()
        doSomethingWith(new BytesRef(bytes));
      }
    }
    ...
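The copy advice in the example can be illustrated without Lucene on the classpath. The following is a minimal sketch under stated assumptions: `Slice` and `fill` are hypothetical stand-ins for a reused byte buffer and for fillBytesRef(), not Lucene classes or methods. It only shows what happens when a consumer keeps a reference to a buffer that is rewritten for each term.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReusedBufferSketch {
    // Hypothetical stand-in for a reusable byte slice; this is NOT Lucene's BytesRef.
    static final class Slice {
        byte[] bytes = new byte[0];
        int length;
    }

    // Hypothetical stand-in for fillBytesRef(): re-encodes the term into the shared slice.
    static void fill(Slice shared, String term) {
        byte[] encoded = term.getBytes(StandardCharsets.UTF_8);
        if (shared.bytes.length < encoded.length) {
            shared.bytes = new byte[encoded.length];
        }
        System.arraycopy(encoded, 0, shared.bytes, 0, encoded.length);
        shared.length = encoded.length;
    }

    // Decodes the valid part of a byte array back to a String for inspection.
    static String decode(byte[] bytes, int length) {
        return new String(bytes, 0, length, StandardCharsets.UTF_8);
    }

    // The mistake: keeps a reference to the shared slice for every term.
    static List<String> collectWithoutCopy(String... terms) {
        Slice shared = new Slice();
        List<Slice> kept = new ArrayList<>();
        for (String term : terms) {
            fill(shared, term);
            kept.add(shared); // the same object is stored every time
        }
        List<String> out = new ArrayList<>();
        for (Slice s : kept) {
            out.add(decode(s.bytes, s.length));
        }
        return out;
    }

    // The fix: copies the valid bytes before the next fill overwrites them.
    static List<String> collectWithCopy(String... terms) {
        Slice shared = new Slice();
        List<byte[]> kept = new ArrayList<>();
        for (String term : terms) {
            fill(shared, term);
            kept.add(Arrays.copyOf(shared.bytes, shared.length));
        }
        List<String> out = new ArrayList<>();
        for (byte[] b : kept) {
            out.add(decode(b, b.length));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collectWithoutCopy("foo", "bar")); // prints "[bar, bar]"
        System.out.println(collectWithCopy("foo", "bar"));    // prints "[foo, bar]"
    }
}
```

Without the copy, every stored entry reflects the last term, which is the same hazard the comments in the example describe for the attribute's reused bytes.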
See CharTermAttributeImpl and its implementation of fillBytesRef() for UTF-8 terms.

Modifier and Type | Method and Description |
---|---|
int | fillBytesRef(): Updates the bytes returned by getBytesRef() to contain this term's final encoding, and returns its hashcode. |
BytesRef | getBytesRef(): Retrieve this attribute's BytesRef. |

int fillBytesRef()
Updates the bytes returned by getBytesRef() to contain this term's final encoding, and returns its hashcode.
Returns: the hashcode as defined by BytesRef.hashCode():

    int hash = 0;
    for (int i = termBytes.offset; i < termBytes.offset+termBytes.length; i++) {
      hash = 31*hash + termBytes.bytes[i];
    }

Implement this for performance reasons, if your code can calculate the hash on-the-fly. If this is not the case, just return termBytes.hashCode().

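As an illustration of calculating the hash on-the-fly, the sketch below folds each byte into the hash in the same loop that writes it, and compares the result with a separate pass that uses the formula quoted for fillBytesRef(). It is a simplified sketch, not Lucene code: `encodeAsciiAndHash` and `hashSlice` are hypothetical names, and the encoder assumes 7-bit ASCII input and a destination array that is large enough.

```java
public class OnTheFlyHashSketch {
    // The documented formula, computed in a separate pass over a byte slice.
    static int hashSlice(byte[] bytes, int offset, int length) {
        int hash = 0;
        for (int i = offset; i < offset + length; i++) {
            hash = 31 * hash + bytes[i];
        }
        return hash;
    }

    // Hypothetical encoder: writes ASCII chars into dest and updates the hash per byte written.
    // Assumption: term is 7-bit ASCII and dest.length >= term.length().
    static int encodeAsciiAndHash(CharSequence term, byte[] dest) {
        int hash = 0;
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (c > 0x7F) {
                throw new IllegalArgumentException("this sketch handles ASCII only");
            }
            byte b = (byte) c;
            dest[i] = b;            // encode
            hash = 31 * hash + b;   // hash in the same pass
        }
        return hash;
    }

    public static void main(String[] args) {
        byte[] dest = new byte[16];
        int onTheFly = encodeAsciiAndHash("lucene", dest);
        int separatePass = hashSlice(dest, 0, "lucene".length());
        System.out.println(onTheFly == separatePass); // prints "true"
    }
}
```

The single-pass version avoids reading the encoded bytes a second time. When the encoder cannot expose its bytes as they are produced, the separate pass yields the same value.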
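BytesRef getBytesRef()
Retrieve this attribute's BytesRef. The bytes are updated from the current term when the consumer calls fillBytesRef().

Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.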