Class BytesRef

java.lang.Object
org.apache.lucene.util.BytesRef
All Implemented Interfaces:
Cloneable, Comparable<BytesRef>

public final class BytesRef extends Object implements Comparable<BytesRef>, Cloneable
Represents byte[], as a slice (offset + length) into an existing byte[]. The bytes member should never be null; use EMPTY_BYTES if necessary.

Important note: Unless otherwise noted, Lucene uses this class to represent terms that are encoded as UTF8 bytes in the index. To convert them to a Java String (which is UTF16), use utf8ToString(). Using code like new String(bytes, offset, length) to do this is wrong, as it does not respect the correct character set and may return wrong results (depending on the platform's defaults)!
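
The point about character sets can be illustrated with the JDK alone. The sketch below is not Lucene code: `Utf8SliceDecoding` and `decodeUtf8` are hypothetical names, and the method only mirrors what the note above says utf8ToString() is for, namely decoding the slice `bytes[offset, offset + length)` with an explicit UTF-8 charset instead of the platform default.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Lucene.
final class Utf8SliceDecoding {
    // Decodes bytes[offset, offset + length) as UTF-8, independent of platform defaults.
    static String decodeUtf8(byte[] bytes, int offset, int length) {
        return new String(bytes, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "caf\u00e9" needs 5 bytes in UTF-8 because U+00E9 is encoded as two bytes.
        byte[] term = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // Place the term inside a larger buffer to mimic a slice (offset + length).
        byte[] buffer = new byte[term.length + 4];
        System.arraycopy(term, 0, buffer, 2, term.length);

        System.out.println(decodeUtf8(buffer, 2, term.length));
    }
}
```

With Lucene on the classpath, calling utf8ToString() on the BytesRef is the documented way to obtain the same result.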

BytesRef implements Comparable. Instances are ordered lexicographically by their bytes, with each byte treated as an unsigned value. For terms encoded as UTF8, this is identical to Unicode code point order.
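
The ordering described above can be reproduced with the JDK's `Arrays.compareUnsigned` (Java 9 and later). This is a sketch of the ordering rule, not Lucene's implementation; `UnsignedSliceOrder` and its methods are hypothetical names. It also shows that unsigned UTF-8 byte order follows code point order where Java's UTF-16 based `String.compareTo` does not.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Lucene.
final class UnsignedSliceOrder {
    // Lexicographic comparison of two slices, treating each byte as a value in 0..255.
    static int compareUnsigned(byte[] a, int aOff, int aLen, byte[] b, int bOff, int bLen) {
        return Arrays.compareUnsigned(a, aOff, aOff + aLen, b, bOff, bOff + bLen);
    }

    // Compares two strings by the unsigned order of their UTF-8 encodings.
    static int compareUtf8(String x, String y) {
        byte[] a = x.getBytes(StandardCharsets.UTF_8);
        byte[] b = y.getBytes(StandardCharsets.UTF_8);
        return compareUnsigned(a, 0, a.length, b, 0, b.length);
    }

    public static void main(String[] args) {
        String bmp = "\uFF5E";        // U+FF5E, a single UTF-16 code unit
        String supp = "\uD83D\uDE00"; // U+1F600, a surrogate pair in UTF-16

        // UTF-8 byte order agrees with code point order: U+FF5E < U+1F600.
        System.out.println(compareUtf8(bmp, supp) < 0); // true
        // UTF-16 code unit order disagrees, because 0xD83D < 0xFF5E.
        System.out.println(bmp.compareTo(supp) < 0);    // false
    }
}
```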

  • Field Summary

    Fields
    Modifier and Type    Field        Description
    byte[]               bytes        The contents of the BytesRef.
    static final byte[]  EMPTY_BYTES  An empty byte array for convenience.
    int                  length       Length of used bytes.
    int                  offset       Offset of first valid byte.
  • Constructor Summary

    Constructors
    Constructor                                     Description
    BytesRef()                                      Create a BytesRef with EMPTY_BYTES.
    BytesRef(byte[] bytes)                          This instance will directly reference bytes without making a copy.
    BytesRef(byte[] bytes, int offset, int length)  This instance will directly reference bytes without making a copy.
    BytesRef(int capacity)                          Create a BytesRef pointing to a new array of size capacity.
    BytesRef(CharSequence text)                     Initialize the byte[] from the UTF8 bytes for the provided String.
  • Method Summary

    Modifier and Type  Method                       Description
    boolean            bytesEquals(BytesRef other)  Expert: compares the bytes against another BytesRef, returning true if the bytes are equal.
    BytesRef           clone()                      Returns a shallow clone of this instance (the underlying bytes are not copied and will be shared by both the returned object and this object).
    int                compareTo(BytesRef other)    Unsigned byte order comparison.
    static BytesRef    deepCopyOf(BytesRef other)   Creates a new BytesRef that points to a copy of the bytes from other.
    boolean            equals(Object other)
    int                hashCode()                   Calculates the hash code as required by TermsHash during indexing.
    boolean            isValid()                    Performs internal consistency checks.
    String             toString()                   Returns hex encoded bytes, e.g. [0x6c 0x75 0x63 0x65 0x6e 0x65].
    String             utf8ToString()               Interprets stored bytes as UTF8 bytes, returning the resulting string.

    Methods inherited from class java.lang.Object

    finalize, getClass, notify, notifyAll, wait, wait, wait
  • Field Details

    • EMPTY_BYTES

      public static final byte[] EMPTY_BYTES
      An empty byte array for convenience
    • bytes

      public byte[] bytes
      The contents of the BytesRef. Should never be null.
    • offset

      public int offset
      Offset of first valid byte.
    • length

      public int length
      Length of used bytes.
  • Constructor Details

    • BytesRef

      public BytesRef()
      Create a BytesRef with EMPTY_BYTES
    • BytesRef

      public BytesRef(byte[] bytes, int offset, int length)
      This instance will directly reference bytes without making a copy. bytes should not be null.
    • BytesRef

      public BytesRef(byte[] bytes)
      This instance will directly reference bytes without making a copy. bytes should not be null.
    • BytesRef

      public BytesRef(int capacity)
      Create a BytesRef pointing to a new array of size capacity. Offset and length will both be zero.
    • BytesRef

      public BytesRef(CharSequence text)
      Initialize the byte[] from the UTF8 bytes for the provided String.
      Parameters:
      text - This must be well-formed unicode text, with no unpaired surrogates.
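
Because the byte[] constructors above reference the caller's array without copying it, later writes to that array are visible through the instance. The sketch below shows this aliasing with a hypothetical stand-in class, `SliceView`, which only mirrors the three public fields described in this page; it is not Lucene's BytesRef.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in mirroring the bytes/offset/length fields; not Lucene's class.
final class SliceView {
    byte[] bytes;
    int offset;
    int length;

    SliceView(byte[] bytes, int offset, int length) {
        this.bytes = bytes;   // keeps a reference to the caller's array, no copy
        this.offset = offset;
        this.length = length;
    }

    // Decodes the referenced slice as UTF-8.
    String utf8ToString() {
        return new String(bytes, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] buffer = "hello world".getBytes(StandardCharsets.UTF_8);
        SliceView view = new SliceView(buffer, 6, 5);
        System.out.println(view.utf8ToString()); // world

        buffer[6] = (byte) 'W';                  // mutate the caller's array
        System.out.println(view.utf8ToString()); // World
    }
}
```

Code that needs a stable snapshot of the bytes should therefore copy them first, which is what deepCopyOf(BytesRef) is documented to do.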
  • Method Details

    • bytesEquals

      public boolean bytesEquals(BytesRef other)
      Expert: compares the bytes against another BytesRef, returning true if the bytes are equal.
      Parameters:
      other - Another BytesRef, should not be null.
      NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
    • clone

      public BytesRef clone()
      Returns a shallow clone of this instance (the underlying bytes are not copied and will be shared by both the returned object and this object).
      Overrides:
      clone in class Object
      See Also:
      deepCopyOf(BytesRef)
    • hashCode

      public int hashCode()
      Calculates the hash code as required by TermsHash during indexing.

      This is currently implemented as MurmurHash3 (32 bit), using the seed from StringHelper.GOOD_FAST_HASH_SEED, but is subject to change from release to release.

      Overrides:
      hashCode in class Object
    • equals

      public boolean equals(Object other)
      Overrides:
      equals in class Object
    • utf8ToString

      public String utf8ToString()
      Interprets stored bytes as UTF8 bytes, returning the resulting string.
    • toString

      public String toString()
      Returns hex encoded bytes, e.g. [0x6c 0x75 0x63 0x65 0x6e 0x65]
      Overrides:
      toString in class Object
    • compareTo

      public int compareTo(BytesRef other)
      Unsigned byte order comparison
      Specified by:
      compareTo in interface Comparable<BytesRef>
    • deepCopyOf

      public static BytesRef deepCopyOf(BytesRef other)
      Creates a new BytesRef that points to a copy of the bytes from other.

      The returned BytesRef will have a length of other.length and an offset of zero.

    • isValid

      public boolean isValid()
      Performs internal consistency checks. Always returns true (or throws IllegalStateException).
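
The difference between clone() (shallow, shared bytes) and deepCopyOf(BytesRef) (fresh array, offset zero) documented above can be sketched as follows. `CopySemantics` is a hypothetical stand-in written against the JDK only; it follows the behavior stated in this page and is not Lucene's implementation.

```java
import java.util.Arrays;

// Hypothetical stand-in for illustration only; not Lucene's class.
final class CopySemantics {
    final byte[] bytes;
    final int offset;
    final int length;

    CopySemantics(byte[] bytes, int offset, int length) {
        this.bytes = bytes;
        this.offset = offset;
        this.length = length;
    }

    // Shallow copy: the new object shares the same underlying array.
    CopySemantics shallowCopy() {
        return new CopySemantics(bytes, offset, length);
    }

    // Deep copy: only the used slice is copied, so the result has offset 0
    // and the same length as the source.
    static CopySemantics deepCopyOf(CopySemantics other) {
        byte[] copy = Arrays.copyOfRange(other.bytes, other.offset, other.offset + other.length);
        return new CopySemantics(copy, 0, other.length);
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3, 4, 5};
        CopySemantics original = new CopySemantics(data, 1, 3);
        CopySemantics shallow = original.shallowCopy();
        CopySemantics deep = deepCopyOf(original);

        data[1] = 9; // write through the shared array
        System.out.println(shallow.bytes[shallow.offset]); // 9, sees the write
        System.out.println(deep.bytes[deep.offset]);       // 2, unaffected
    }
}
```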