Interface TermToBytesRefAttribute

All Superinterfaces:
Attribute
All Known Subinterfaces:
BytesTermAttribute
All Known Implementing Classes:
BytesTermAttributeImpl, CharTermAttributeImpl, PackedTokenAttributeImpl

public interface TermToBytesRefAttribute extends Attribute
This attribute is requested by TermsHashPerField to index the contents. This attribute can be used to customize the final byte[] encoding of terms.

Consumers of this attribute call getBytesRef() for each term. Example:

   final TermToBytesRefAttribute termAtt = tokenStream.getAttribute(TermToBytesRefAttribute.class);

   while (tokenStream.incrementToken()) {
     final BytesRef bytes = termAtt.getBytesRef();

     if (isInteresting(bytes)) {

       // because the bytes are reused by the attribute (like CharTermAttribute's char[] buffer),
       // you should make a copy if you need persistent access to the bytes, otherwise they will
       // be rewritten across calls to incrementToken()

       doSomethingWith(BytesRef.deepCopyOf(bytes));
     }
   }
   ...
 
NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
This is a very expert, internal API. For UTF-8 terms, use CharTermAttribute and its implementation; to index binary terms, use BytesTermAttribute and its implementation.
  • Method Summary

    Modifier and Type    Method           Description
    BytesRef             getBytesRef()    Retrieve this attribute's BytesRef.
  • Method Details

    • getBytesRef

      BytesRef getBytesRef()
      Retrieve this attribute's BytesRef. The bytes are updated from the current term. The implementation may return a new instance or keep the previous one.
      Returns:
      a BytesRef to be indexed (valid only until the token stream is next incremented)
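
The validity rule above can be illustrated without Lucene on the classpath. The following is a minimal sketch only: ReusedTermBytes and ReuseDemo are hypothetical stand-ins (not Lucene classes) that mimic an attribute overwriting one shared buffer for each term, and Arrays.copyOf plays the role that BytesRef.deepCopyOf plays in the example earlier on this page. A fixed 64-byte buffer is assumed to keep the sketch short.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an attribute that reuses one buffer across terms.
// NOT a Lucene class; it only mimics the reuse contract described above.
final class ReusedTermBytes {
  private final byte[] buffer = new byte[64]; // assumption: terms fit in 64 bytes
  private int length;

  // Overwrites the shared buffer with the UTF-8 bytes of the next term.
  void setTerm(String term) {
    byte[] utf8 = term.getBytes(StandardCharsets.UTF_8);
    if (utf8.length > buffer.length) {
      throw new IllegalArgumentException("term too long for this sketch");
    }
    System.arraycopy(utf8, 0, buffer, 0, utf8.length);
    length = utf8.length;
  }

  byte[] bytes() { return buffer; }   // the same array on every call
  int length() { return length; }
}

final class ReuseDemo {
  // Captures each term's bytes while iterating, then decodes them afterwards.
  // With copy == false every captured entry aliases the shared buffer.
  static List<String> decodeAfterIteration(boolean copy, String... terms) {
    ReusedTermBytes att = new ReusedTermBytes();
    List<byte[]> kept = new ArrayList<>();
    List<Integer> lengths = new ArrayList<>();
    for (String term : terms) {
      att.setTerm(term);
      kept.add(copy ? Arrays.copyOf(att.bytes(), att.length()) : att.bytes());
      lengths.add(att.length());
    }
    List<String> out = new ArrayList<>();
    for (int i = 0; i < kept.size(); i++) {
      out.add(new String(kept.get(i), 0, lengths.get(i), StandardCharsets.UTF_8));
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(decodeAfterIteration(false, "foo", "bar")); // prints [bar, bar]
    System.out.println(decodeAfterIteration(true, "foo", "bar"));  // prints [foo, bar]
  }
}
```

Without the copy, both saved entries point at the same array, so the first term is lost once the second one is written. With the copy, each term survives the next advance.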