NumericUtils (Lucene 4.2.0 API)

java.lang.Object
- org.apache.lucene.util.NumericUtils

```
public final class NumericUtils
extends Object
```
This helper class generates prefix-encoded representations of numerical values and supplies converters that represent float/double values as sortable integers/longs.
To quickly execute range queries in Apache Lucene, a range is divided recursively into multiple intervals for searching: The center of the range is searched only with the lowest possible precision in the trie, while the boundaries are matched more exactly. This reduces the number of terms dramatically.
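The splitting idea can be illustrated with a small, self-contained sketch. It is not Lucene's implementation: the class `TrieRangeSketch`, its `split` method, and the `SubRange` holder are hypothetical names introduced here, and each sub-range is reported as plain bounds plus a shift instead of encoded terms. The edges of the range come out at full precision (shift 0), while the aligned middle comes out once at a coarser precision.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the Lucene API.
final class TrieRangeSketch {

    /** Inclusive bounds of one sub-range and the shift (precision) it is matched at. */
    static final class SubRange {
        final long min;
        final long max;
        final int shift;

        SubRange(long min, long max, int shift) {
            this.min = min;
            this.max = max;
            this.shift = shift;
        }

        @Override
        public String toString() {
            return "[" + min + ".." + max + "] shift=" + shift;
        }
    }

    /** Records a sub-range; the bits already shifted away are set in the upper bound. */
    private static void emit(List<SubRange> out, long min, long max, int shift) {
        out.add(new SubRange(min, max | ((1L << shift) - 1L), shift));
    }

    /** Splits the inclusive range [minBound, maxBound] into sub-ranges of mixed precision. */
    static List<SubRange> split(int precisionStep, long minBound, long maxBound) {
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be >= 1");
        }
        List<SubRange> out = new ArrayList<>();
        if (minBound > maxBound) {
            return out; // empty range
        }
        for (int shift = 0; ; shift += precisionStep) {
            // Block size of the next (coarser) level, and the bits decided at this level.
            long diff = 1L << (shift + precisionStep);
            long mask = ((1L << precisionStep) - 1L) << shift;
            // Does a bound sit inside a coarser block instead of exactly on its edge?
            boolean hasLower = (minBound & mask) != 0L;
            boolean hasUpper = (maxBound & mask) != mask;
            long nextMin = (hasLower ? minBound + diff : minBound) & ~mask;
            long nextMax = (hasUpper ? maxBound - diff : maxBound) & ~mask;
            boolean wrapped = nextMin < minBound || nextMax > maxBound; // arithmetic overflow
            if (shift + precisionStep >= 64 || nextMin > nextMax || wrapped) {
                // No coarser level is usable: emit what is left at the current precision.
                emit(out, minBound, maxBound, shift);
                return out;
            }
            if (hasLower) {
                emit(out, minBound, minBound | mask, shift);
            }
            if (hasUpper) {
                emit(out, maxBound & ~mask, maxBound, shift);
            }
            minBound = nextMin;
            maxBound = nextMax;
        }
    }

    public static void main(String[] args) {
        for (SubRange r : split(4, 3L, 100L)) {
            System.out.println(r);
        }
    }
}
```

For the range 3..100 with a precision step of 4, the sketch yields the edge pieces 3..15 and 96..100 at shift 0 and the middle 16..95 at shift 4, so the middle needs 5 coarse terms instead of 80 full-precision ones.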
This class generates the terms needed for this. First, the numerical integer values are converted to bytes: integer values (32 bit or 64 bit) are made unsigned, and their bits are written as ASCII chars carrying 7 bits each. The resulting byte[] sorts like the original integer value (even using UTF-8 sort order). Each value is also prefixed (in the first char) by the shift value (the number of bits removed) used during encoding.
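The encoding steps above can be sketched in plain Java. This is an illustration under stated assumptions, not the library's code: `PrefixCodeSketch`, `encode`, and `compareBytes` are hypothetical names, and the marker offset `0x20` used for the first byte is an assumed value standing in for `SHIFT_START_LONG`.

```java
// Hypothetical illustration only; not part of the Lucene API.
final class PrefixCodeSketch {

    // Assumed offset for the shift marker in the first byte.
    static final int SHIFT_START = 0x20;

    /** Encodes val with the lowest `shift` bits removed, 7 bits per byte. */
    static byte[] encode(long val, int shift) {
        if (shift < 0 || shift > 63) {
            throw new IllegalArgumentException("shift must be 0..63");
        }
        int nChars = (63 - shift) / 7 + 1;          // 7-bit groups needed for the remaining bits
        byte[] out = new byte[nChars + 1];          // plus one byte for the shift marker
        out[0] = (byte) (SHIFT_START + shift);
        // Flipping the sign bit makes the signed order match the unsigned bit order.
        long sortableBits = (val ^ 0x8000000000000000L) >>> shift;
        for (int i = nChars; i > 0; i--) {
            out[i] = (byte) (sortableBits & 0x7f);  // every byte stays in the ASCII range
            sortableBits >>>= 7;
        }
        return out;
    }

    /** Lexicographic comparison of two byte arrays, treating bytes as unsigned. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Full-precision terms compare like the numbers they encode.
        System.out.println(compareBytes(encode(-5L, 0), encode(3L, 0)) < 0);
        // Values that differ only in the removed bits share one lower-precision term.
        System.out.println(java.util.Arrays.equals(encode(0x1234L, 8), encode(0x12FFL, 8)));
    }
}
```

In this sketch a full-precision long takes 11 bytes (one marker byte plus ten 7-bit groups), and removing bits shortens the term.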
To also index floating point numbers, this class supplies two methods that convert them to integer values by changing their bit layout: doubleToSortableLong(double) and floatToSortableInt(float). Converting floating point numbers to integers and back loses no precision (the integer form is only meaningful for comparison, not as a numeric value). Other data types like dates can easily be converted to longs or ints (e.g. date to long: Date.getTime()).
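A minimal sketch of such a bit-layout conversion for double follows. It shows one common way to get a sortable long (flip the non-sign bits of negative values) and is an assumption about the technique, not a copy of the library's code; `SortableDoubleSketch` and its method names are hypothetical.

```java
// Hypothetical illustration only; not part of the Lucene API.
final class SortableDoubleSketch {

    /** Maps a double to a long whose signed order matches the numeric order. */
    static long toSortableLong(double val) {
        long bits = Double.doubleToLongBits(val);
        if (bits < 0) {
            // Negative doubles order backwards in raw form; flip every bit except the sign.
            bits ^= 0x7fffffffffffffffL;
        }
        return bits;
    }

    /** Reverses toSortableLong; the XOR is its own inverse, so no precision is lost. */
    static double fromSortableLong(long val) {
        if (val < 0) {
            val ^= 0x7fffffffffffffffL;
        }
        return Double.longBitsToDouble(val);
    }

    public static void main(String[] args) {
        double[] values = {-1e300, -2.5, 0.0, 1.5};
        for (double v : values) {
            long sortable = toSortableLong(v);
            System.out.println(v + " -> " + sortable + " -> " + fromSortableLong(sortable));
        }
    }
}
```

The long produced here carries no numeric meaning of its own; it is only useful for comparison and for converting back.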
For ease of use, the trie algorithm is implemented for indexing inside NumericTokenStream, which can index int, long, float, and double. For querying, NumericRangeQuery and NumericRangeFilter implement the query part for the same data types.
This class can also be used to generate lexicographically sortable (according to BytesRef.getUTF8SortedAsUTF16Comparator()) representations of numeric data types for other uses (e.g. sorting).

Since:

2.9; the API changed in a non-backwards-compatible way in 4.0

NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.

Nested Class Summary

Nested Classes
Modifier and Type	Class and Description
`static class`	`NumericUtils.IntRangeBuilder` Callback for `splitIntRange(org.apache.lucene.util.NumericUtils.IntRangeBuilder, int, int, int)`.
`static class`	`NumericUtils.LongRangeBuilder` Callback for `splitLongRange(org.apache.lucene.util.NumericUtils.LongRangeBuilder, int, long, long)`.

Field Summary

Fields
Modifier and Type	Field and Description
`static int`	`BUF_SIZE_INT` The maximum term length (used for `byte[]` buffer size) for encoding `int` values.
`static int`	`BUF_SIZE_LONG` The maximum term length (used for `byte[]` buffer size) for encoding `long` values.
`static int`	`PRECISION_STEP_DEFAULT` The default precision step used by `IntField`, `FloatField`, `LongField`, `DoubleField`, `NumericTokenStream`, `NumericRangeQuery`, and `NumericRangeFilter`.
`static byte`	`SHIFT_START_INT` Integers are stored at lower precision by shifting off lower bits.
`static byte`	`SHIFT_START_LONG` Longs are stored at lower precision by shifting off lower bits.

Method Summary

Methods
Modifier and Type	Method and Description
`static long`	`doubleToSortableLong(double val)` Converts a `double` value to a sortable signed `long`.
`static TermsEnum`	`filterPrefixCodedInts(TermsEnum termsEnum)` Filters the given `TermsEnum` by accepting only prefix coded 32 bit terms with a shift value of `0`.
`static TermsEnum`	`filterPrefixCodedLongs(TermsEnum termsEnum)` Filters the given `TermsEnum` by accepting only prefix coded 64 bit terms with a shift value of `0`.
`static int`	`floatToSortableInt(float val)` Converts a `float` value to a sortable signed `int`.
`static int`	`getPrefixCodedIntShift(BytesRef val)` Returns the shift value from a prefix encoded `int`.
`static int`	`getPrefixCodedLongShift(BytesRef val)` Returns the shift value from a prefix encoded `long`.
`static int`	`intToPrefixCoded(int val, int shift, BytesRef bytes)` Returns prefix coded bits after reducing the precision by `shift` bits.
`static void`	`intToPrefixCodedBytes(int val, int shift, BytesRef bytes)` Writes the prefix coded bits into `bytes` after reducing the precision by `shift` bits.
`static int`	`longToPrefixCoded(long val, int shift, BytesRef bytes)` Returns prefix coded bits after reducing the precision by `shift` bits.
`static void`	`longToPrefixCodedBytes(long val, int shift, BytesRef bytes)` Writes the prefix coded bits into `bytes` after reducing the precision by `shift` bits.
`static int`	`prefixCodedToInt(BytesRef val)` Returns an int from prefixCoded bytes.
`static long`	`prefixCodedToLong(BytesRef val)` Returns a long from prefixCoded bytes.
`static float`	`sortableIntToFloat(int val)` Converts a sortable `int` back to a `float`.
`static double`	`sortableLongToDouble(long val)` Converts a sortable `long` back to a `double`.
`static void`	`splitIntRange(NumericUtils.IntRangeBuilder builder, int precisionStep, int minBound, int maxBound)` Splits an int range recursively.
`static void`	`splitLongRange(NumericUtils.LongRangeBuilder builder, int precisionStep, long minBound, long maxBound)` Splits a long range recursively.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - PRECISION_STEP_DEFAULT
```
public static final int PRECISION_STEP_DEFAULT
```
    The default precision step used by IntField, FloatField, LongField, DoubleField, NumericTokenStream, NumericRangeQuery, and NumericRangeFilter.
    
    See Also:
    Constant Field Values
  - SHIFT_START_LONG
```
public static final byte SHIFT_START_LONG
```
    Longs are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_LONG+shift in the first byte.
    
    See Also:
    Constant Field Values
  - BUF_SIZE_LONG
```
public static final int BUF_SIZE_LONG
```
    The maximum term length (used for byte[] buffer size) for encoding long values.
    
    See Also:
    longToPrefixCodedBytes(long, int, org.apache.lucene.util.BytesRef), Constant Field Values
  - SHIFT_START_INT
```
public static final byte SHIFT_START_INT
```
    Integers are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_INT+shift in the first byte.
    
    See Also:
    Constant Field Values
  - BUF_SIZE_INT
```
public static final int BUF_SIZE_INT
```
    The maximum term length (used for byte[] buffer size) for encoding int values.
    
    See Also:
    intToPrefixCodedBytes(int, int, org.apache.lucene.util.BytesRef), Constant Field Values
- Method Detail
  - longToPrefixCoded
```
public static int longToPrefixCoded(long val,
                    int shift,
                    BytesRef bytes)
```
    Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream. After encoding, bytes.offset will always be 0.
    
    Parameters:
    val - the numeric value
    shift - how many bits to strip from the right
    bytes - will contain the encoded value
    
    Returns:
    the hash code for indexing (TermsHash)
  - intToPrefixCoded
```
public static int intToPrefixCoded(int val,
                   int shift,
                   BytesRef bytes)
```
    Returns prefix coded bits after reducing the precision by shift bits. This method is used by NumericTokenStream. After encoding, bytes.offset will always be 0.
    
    Parameters:
    val - the numeric value
    shift - how many bits to strip from the right
    bytes - will contain the encoded value
    
    Returns:
    the hash code for indexing (TermsHash)
  - longToPrefixCodedBytes
```
public static void longToPrefixCodedBytes(long val,
                          int shift,
                          BytesRef bytes)
```
    Writes the prefix coded bits into bytes after reducing the precision by shift bits. This method is used by NumericTokenStream. After encoding, bytes.offset will always be 0.
    
    Parameters:
    val - the numeric value
    shift - how many bits to strip from the right
    bytes - will contain the encoded value
  - intToPrefixCodedBytes
```
public static void intToPrefixCodedBytes(int val,
                         int shift,
                         BytesRef bytes)
```
    Writes the prefix coded bits into bytes after reducing the precision by shift bits. This method is used by NumericTokenStream. After encoding, bytes.offset will always be 0.
    
    Parameters:
    val - the numeric value
    shift - how many bits to strip from the right
    bytes - will contain the encoded value
  - getPrefixCodedLongShift
```
public static int getPrefixCodedLongShift(BytesRef val)
```
    Returns the shift value from a prefix encoded long.
    
    Throws:
    
    NumberFormatException - if the supplied BytesRef is not correctly prefix encoded.
  - getPrefixCodedIntShift
```
public static int getPrefixCodedIntShift(BytesRef val)
```
    Returns the shift value from a prefix encoded int.
    
    Throws:
    
    NumberFormatException - if the supplied BytesRef is not correctly prefix encoded.
  - prefixCodedToLong
```
public static long prefixCodedToLong(BytesRef val)
```
    Returns a long from prefixCoded bytes. Rightmost bits will be zero for lower precision codes. This method can be used to decode a term's value.
    
    Throws:
    
    NumberFormatException - if the supplied BytesRef is not correctly prefix encoded.
    See Also:
    longToPrefixCodedBytes(long, int, org.apache.lucene.util.BytesRef)
  - prefixCodedToInt
```
public static int prefixCodedToInt(BytesRef val)
```
    Returns an int from prefixCoded bytes. Rightmost bits will be zero for lower precision codes. This method can be used to decode a term's value.
    
    Throws:
    
    NumberFormatException - if the supplied BytesRef is not correctly prefix encoded.
    See Also:
    intToPrefixCodedBytes(int, int, org.apache.lucene.util.BytesRef)
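    Decoding, including the zeroed rightmost bits of lower precision codes, can be sketched as follows. This is a self-contained illustration, not the library's code: `PrefixDecodeSketch` and its methods are hypothetical names that work on a plain byte[] instead of a BytesRef, and the marker offset `0x60` is an assumed value standing in for SHIFT_START_INT.

```java
// Hypothetical illustration only; not part of the Lucene API.
final class PrefixDecodeSketch {

    // Assumed offset for the shift marker in the first byte.
    static final int SHIFT_START = 0x60;

    /** Encodes val with the lowest `shift` bits removed, 7 bits per byte. */
    static byte[] encode(int val, int shift) {
        if (shift < 0 || shift > 31) {
            throw new IllegalArgumentException("shift must be 0..31");
        }
        int nChars = (31 - shift) / 7 + 1;
        byte[] out = new byte[nChars + 1];
        out[0] = (byte) (SHIFT_START + shift);
        int sortableBits = (val ^ 0x80000000) >>> shift; // flip the sign bit, drop low bits
        for (int i = nChars; i > 0; i--) {
            out[i] = (byte) (sortableBits & 0x7f);
            sortableBits >>>= 7;
        }
        return out;
    }

    /** Reads the shift back from the first byte of an encoded term. */
    static int shiftOf(byte[] term) {
        int shift = term[0] - SHIFT_START;
        if (shift < 0 || shift > 31) {
            throw new NumberFormatException("Invalid shift value in prefix coded bytes");
        }
        return shift;
    }

    /** Rebuilds the int; bits removed during encoding come back as zeros. */
    static int decode(byte[] term) {
        int sortableBits = 0;
        for (int i = 1; i < term.length; i++) {
            if (term[i] < 0) {
                throw new NumberFormatException("Invalid byte in prefix coded bytes");
            }
            sortableBits = (sortableBits << 7) | term[i];
        }
        return (sortableBits << shiftOf(term)) ^ 0x80000000;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(decode(encode(0x12345678, 0))));
        System.out.println(Integer.toHexString(decode(encode(0x12345678, 8))));
    }
}
```

    With a shift of 8, the value 0x12345678 decodes to 0x12345600 in this sketch: the eight removed bits are returned as zeros.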
  - doubleToSortableLong
```
public static long doubleToSortableLong(double val)
```
    Converts a double value to a sortable signed long. The value is converted by taking its IEEE 754 floating-point "double format" bit layout and then swapping some bits so that the result can be compared as a long. This does not reduce the precision, and the value can easily be used as a long. The sort order (including Double.NaN) is defined by Double.compareTo(java.lang.Double); NaN is greater than positive infinity.
    
    See Also:
    sortableLongToDouble(long)
  - sortableLongToDouble
```
public static double sortableLongToDouble(long val)
```
    Converts a sortable long back to a double.
    
    See Also:
    doubleToSortableLong(double)
  - floatToSortableInt
```
public static int floatToSortableInt(float val)
```
    Converts a float value to a sortable signed int. The value is converted by taking its IEEE 754 floating-point "float format" bit layout and then swapping some bits so that the result can be compared as an int. This does not reduce the precision, and the value can easily be used as an int. The sort order (including Float.NaN) is defined by Float.compareTo(java.lang.Float); NaN is greater than positive infinity.
    
    See Also:
    sortableIntToFloat(int)
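    The stated sort order can be checked against Float.compare with a short sketch. It assumes the conversion flips the non-sign bits of negative values, which is one way to obtain this order and not necessarily the library's exact code; `SortableFloatSketch` and its methods are hypothetical names.

```java
// Hypothetical illustration only; not part of the Lucene API.
final class SortableFloatSketch {

    /** Maps a float to an int whose signed order matches Float.compare. */
    static int toSortableInt(float val) {
        int bits = Float.floatToIntBits(val); // collapses all NaNs to one canonical pattern
        if (bits < 0) {
            bits ^= 0x7fffffff;               // reverse the order of negative values
        }
        return bits;
    }

    /** Reverses toSortableInt. */
    static float fromSortableInt(int val) {
        if (val < 0) {
            val ^= 0x7fffffff;
        }
        return Float.intBitsToFloat(val);
    }

    /** True if comparing the sortable ints gives the same sign as Float.compare. */
    static boolean agreesWithFloatCompare(float a, float b) {
        int viaInts = Integer.signum(Integer.compare(toSortableInt(a), toSortableInt(b)));
        return viaInts == Integer.signum(Float.compare(a, b));
    }

    public static void main(String[] args) {
        // NaN sorts above positive infinity, and -0.0f sorts below 0.0f.
        System.out.println(toSortableInt(Float.NaN) > toSortableInt(Float.POSITIVE_INFINITY));
        System.out.println(toSortableInt(-0.0f) < toSortableInt(0.0f));
    }
}
```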
  - sortableIntToFloat
```
public static float sortableIntToFloat(int val)
```
    Converts a sortable int back to a float.
    
    See Also:
    floatToSortableInt(float)
  - splitLongRange
```
public static void splitLongRange(NumericUtils.LongRangeBuilder builder,
                  int precisionStep,
                  long minBound,
                  long maxBound)
```
    Splits a long range recursively. You may implement a builder that adds clauses to a BooleanQuery for each call to its NumericUtils.LongRangeBuilder.addRange(BytesRef,BytesRef) method.
    This method is used by NumericRangeQuery.
  - splitIntRange
```
public static void splitIntRange(NumericUtils.IntRangeBuilder builder,
                 int precisionStep,
                 int minBound,
                 int maxBound)
```
    Splits an int range recursively. You may implement a builder that adds clauses to a BooleanQuery for each call to its NumericUtils.IntRangeBuilder.addRange(BytesRef,BytesRef) method.
    This method is used by NumericRangeQuery.
  - filterPrefixCodedLongs
```
public static TermsEnum filterPrefixCodedLongs(TermsEnum termsEnum)
```
    Filters the given TermsEnum by accepting only prefix coded 64 bit terms with a shift value of 0.
    
    Parameters:
    termsEnum - the terms enum to filter
    
    Returns:
    a filtered TermsEnum that only returns prefix coded 64 bit terms with a shift value of 0.
  - filterPrefixCodedInts
```
public static TermsEnum filterPrefixCodedInts(TermsEnum termsEnum)
```
    Filters the given TermsEnum by accepting only prefix coded 32 bit terms with a shift value of 0.
    
    Parameters:
    termsEnum - the terms enum to filter
    
    Returns:
    a filtered TermsEnum that only returns prefix coded 32 bit terms with a shift value of 0.
