org.apache.lucene.util.quantization.OptimizedScalarQuantizer

public class OptimizedScalarQuantizer extends Object

A scalar quantizer that optimizes the quantization intervals for a given vector. It does so by optimizing the quantiles of the vector centered on a provided centroid, minimizing the quantization loss via coordinate descent.
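To make the idea of a quantization interval concrete, the following is a minimal sketch of quantizing a centered vector into a fixed interval. It is not the Lucene implementation: the class name `IntervalQuantizeSketch`, the method `quantize`, and the choice to pass `lower` and `upper` in directly are all assumptions made for illustration. The class described here searches for those bounds itself; this sketch leaves that search out.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not the Lucene implementation. */
final class IntervalQuantizeSketch {

  /**
   * Centers the vector on the centroid, clamps each component to
   * [lower, upper], and snaps it to one of 2^bits evenly spaced levels.
   * Assumes 1 <= bits <= 7 so every level fits in a non-negative byte.
   */
  static void quantize(
      float[] vector, float[] centroid, float lower, float upper, int bits, byte[] destination) {
    int maxLevel = (1 << bits) - 1;
    float step = (upper - lower) / maxLevel;
    for (int i = 0; i < vector.length; i++) {
      float centered = vector[i] - centroid[i];
      float clamped = Math.max(lower, Math.min(upper, centered));
      destination[i] = (byte) Math.round((clamped - lower) / step);
    }
  }

  public static void main(String[] args) {
    float[] vector = {-1f, 0.6f, 1.6f, 3.5f};
    float[] centroid = {1f, 1f, 1f, 2f};
    byte[] destination = new byte[vector.length];
    quantize(vector, centroid, -1.5f, 1.5f, 2, destination);
    System.out.println(Arrays.toString(destination));
  }
}
```

A narrower interval clamps more outliers but gives finer steps for the remaining components, which is the trade-off the interval optimization has to balance.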

Local vector quantization parameters were originally proposed with LVQ in "Similarity search in the blink of an eye with compressed indices". This technique builds on LVQ, but instead of taking the min/max values as the interval bounds, it performs a grid search over the centered vector to find the optimal quantization intervals, taking anisotropic loss into account.

Anisotropic loss was first discussed in depth in "Accelerating Large-Scale Inference with Anisotropic Vector Quantization" by Ruiqi Guo et al.
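As a rough illustration of what an anisotropic loss looks like, the sketch below weights the part of the quantization error that is parallel to the original vector more heavily than the total error. The exact formula, the class name `AnisotropicLossSketch`, and the role given to `lambda` here are assumptions for illustration; this page does not state the formula the class uses.

```java
/** Hypothetical illustration only; the weighting form is an assumption. */
final class AnisotropicLossSketch {

  /**
   * Returns (1 - lambda) * (x . e)^2 / ||x||^2 + lambda * ||e||^2,
   * where e = x - xq is the per-component quantization error.
   * The first term is the squared error component parallel to x.
   * Assumes x is not the zero vector.
   */
  static double loss(float[] x, float[] xq, float lambda) {
    double norm2 = 0;
    double xe = 0;
    double e2 = 0;
    for (int i = 0; i < x.length; i++) {
      double err = x[i] - xq[i];
      norm2 += (double) x[i] * x[i];
      xe += x[i] * err;
      e2 += err * err;
    }
    return (1 - lambda) * xe * xe / norm2 + lambda * e2;
  }

  public static void main(String[] args) {
    float[] x = {1f, 0f};
    // Same error magnitude, once along x and once orthogonal to it.
    System.out.println(loss(x, new float[] {0.5f, 0f}, 0.1f));
    System.out.println(loss(x, new float[] {1f, 0.5f}, 0.1f));
  }
}
```

With a small `lambda`, an error of the same magnitude costs far more when it points along the vector than when it is orthogonal to it, which matters for similarity scores based on inner products.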

WARNING: This API is experimental and might change in incompatible ways in the next release.

Nested Class Summary

Nested Classes

Modifier and Type

Class

Description

static final record

OptimizedScalarQuantizer.QuantizationResult

Quantization result containing the lower and upper interval bounds, the additional correction
Constructor Summary

Constructors

Constructor

Description

OptimizedScalarQuantizer(VectorSimilarityFunction similarityFunction)

Create a new scalar quantizer with the default lambda and number of iterations.

OptimizedScalarQuantizer(VectorSimilarityFunction similarityFunction, float lambda, int iters)

Create a new scalar quantizer with the given similarity function, lambda, and number of iterations.
Method Summary

Modifier and Type

Method

Description

static int

discretize(int value, int bucket)

OptimizedScalarQuantizer.QuantizationResult[]

multiScalarQuantize(float[] vector, byte[][] destinations, byte[] bits, float[] centroid)

Quantize the vector to multiple bit levels.

static void

packAsBinary(byte[] vector, byte[] packed)

Pack the vector as a binary array.

OptimizedScalarQuantizer.QuantizationResult

scalarQuantize(float[] vector, byte[] destination, byte bits, float[] centroid)

Quantize the vector to the given bit level.

static void

transposeHalfByte(byte[] q, byte[] quantQueryByte)

Transpose the query vector into a byte array allowing for efficient bitwise operations with the index bit vectors.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- OptimizedScalarQuantizer
  
  public OptimizedScalarQuantizer(VectorSimilarityFunction similarityFunction, float lambda, int iters)
  
  Create a new scalar quantizer with the given similarity function, lambda, and number of iterations.
  
  Parameters:
  
  similarityFunction - similarity function to use
  
  lambda - lambda value to use
  
  iters - number of iterations to use
- OptimizedScalarQuantizer
  
  public OptimizedScalarQuantizer(VectorSimilarityFunction similarityFunction)
  
  Create a new scalar quantizer with the default lambda and number of iterations.
  
  Parameters:
  
  similarityFunction - similarity function to use
Method Details
- multiScalarQuantize
  
  public OptimizedScalarQuantizer.QuantizationResult[] multiScalarQuantize(float[] vector, byte[][] destinations, byte[] bits, float[] centroid)
  
  Quantize the vector to multiple bit levels.
  
  Parameters:
  
  vector - raw vector
  
  destinations - array of destinations to store the quantized vector
  
  bits - array of bit levels to quantize the vector to
  
  centroid - centroid to center the vector
  
  Returns:
  
  array of quantization results
- scalarQuantize
  
  public OptimizedScalarQuantizer.QuantizationResult scalarQuantize(float[] vector, byte[] destination, byte bits, float[] centroid)
  
  Quantize the vector to the given bit level.
  
  Parameters:
  
  vector - raw vector
  
  destination - destination to store the quantized vector
  
  bits - number of bits to quantize the vector
  
  centroid - centroid to center the vector
  
  Returns:
  
  quantization result
- discretize
  
  public static int discretize(int value, int bucket)
- transposeHalfByte
  
  public static void transposeHalfByte(byte[] q, byte[] quantQueryByte)
  Transpose the query vector into a byte array allowing for efficient bitwise operations with the index bit vectors. The idea here is to organize the query vector bits such that the first bit of every dimension is in the first set of dimensions bits, or (dimensions/8) bytes. The second, third, and fourth bits are in the second, third, and fourth sets of dimensions bits, respectively. This allows for direct bitwise comparisons with the stored index vectors by summing the bitwise results with the relative required bit shifts.
  This bit decomposition for fast bitwise SIMD operations was first proposed in:
  Gao, Jianyang, and Cheng Long. "RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound for Approximate Nearest Neighbor Search." Proceedings of the ACM on Management of Data 2, no. 3 (2024): 1-27.
  Parameters:
  
  q - the query vector, assumed to be half-byte quantized with values between 0 and 15
  
  quantQueryByte - the byte array to store the transposed query vector
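The bit-plane layout described for transposeHalfByte can be sketched as follows. This is an illustration under stated assumptions, not the Lucene code: the class `BitPlaneTransposeSketch`, both method names, the most-significant-bit-first ordering inside each byte, and the rounding of the plane size up to whole bytes are all choices made for this example.

```java
import java.util.Arrays;

/** Hypothetical illustration only; bit order within a byte is an assumption. */
final class BitPlaneTransposeSketch {

  /** Splits 4-bit values (0..15) into four bit planes of ceil(dims / 8) bytes each. */
  static void transpose(byte[] q, byte[] out) {
    int bytesPerPlane = (q.length + 7) / 8;
    Arrays.fill(out, (byte) 0);
    for (int i = 0; i < q.length; i++) {
      for (int plane = 0; plane < 4; plane++) {
        int bit = (q[i] >> plane) & 1;
        // Dimension i lands in byte i / 8 of its plane, highest bit first.
        out[plane * bytesPerPlane + i / 8] |= (byte) (bit << (7 - (i % 8)));
      }
    }
  }

  /** Dot product of the transposed query with a 1-bit-per-dimension packed vector. */
  static int dot(byte[] transposed, byte[] packedIndex) {
    int bytesPerPlane = packedIndex.length;
    int sum = 0;
    for (int plane = 0; plane < 4; plane++) {
      int count = 0;
      for (int b = 0; b < bytesPerPlane; b++) {
        count += Integer.bitCount(transposed[plane * bytesPerPlane + b] & packedIndex[b] & 0xFF);
      }
      // Plane k holds bit k of every dimension, so its matches weigh 2^k.
      sum += count << plane;
    }
    return sum;
  }

  public static void main(String[] args) {
    byte[] q = {1, 2, 4, 8, 15, 0, 3, 5};
    byte[] transposed = new byte[4];
    transpose(q, transposed);
    byte[] index = {(byte) 0x99}; // dimensions 0, 3, 4, and 7 are set
    System.out.println(dot(transposed, index)); // 1 + 8 + 15 + 5 = 29
  }
}
```

Each plane needs only an AND and a population count against the stored bit vector, and the shift by the plane number restores the weight of that bit position.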
- packAsBinary
  
  public static void packAsBinary(byte[] vector, byte[] packed)
  
  Pack the vector as a binary array.
  
  Parameters:
  
  vector - the vector to pack
  
  packed - the packed vector
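Packing a vector of single-bit values into bytes can be sketched as below. The class `PackBinarySketch`, the method name `pack`, and the most-significant-bit-first ordering are assumptions for illustration; this page does not specify the bit order packAsBinary uses.

```java
/** Hypothetical illustration only; bit order within a byte is an assumption. */
final class PackBinarySketch {

  /**
   * Packs values that are each 0 or 1 into bytes, eight per byte, with the
   * first value of each group in the highest bit. A trailing partial group
   * is padded with zero bits. Assumes packed.length >= ceil(vector.length / 8).
   */
  static void pack(byte[] vector, byte[] packed) {
    for (int i = 0; i < vector.length; i += 8) {
      int result = 0;
      for (int j = 0; j < 8 && i + j < vector.length; j++) {
        result |= (vector[i + j] & 1) << (7 - j);
      }
      packed[i / 8] = (byte) result;
    }
  }

  public static void main(String[] args) {
    byte[] vector = {1, 0, 0, 1, 1, 0, 0, 1, 1, 1};
    byte[] packed = new byte[2];
    pack(vector, packed);
    System.out.printf("%02X %02X%n", packed[0] & 0xFF, packed[1] & 0xFF); // 99 C0
  }
}
```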
