org.apache.lucene.util.ScalarQuantizer

public class ScalarQuantizer extends Object

Quantizes float vectors into `int8` byte values using scalar quantization. This is a lossy transformation. Scalar quantization first calculates the quantiles of the float vector values, using the configured confidence interval. The resulting range [minQuantile, maxQuantile] is then used to scale the values into the range [0, 127], and each scaled value is rounded to the nearest byte value.

How Scalar Quantization Works

The underlying math is straightforward and based on min/max normalization. Given a float vector `v` and a confidence interval `q`, we can calculate the quantiles [minQuantile, maxQuantile] of the vector values. Each float value is then converted to a byte, and a byte back to an approximate float, as follows:

   byte = (float - minQuantile) * 127/(maxQuantile - minQuantile)
   float = (maxQuantile - minQuantile)/127 * byte + minQuantile
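
The two equations above can be sketched in plain Java as follows. This is a minimal illustration, not the Lucene implementation: the class `MinMaxQuantizationSketch` and its methods are hypothetical, and clamping out-of-range values to [minQuantile, maxQuantile] before scaling is an assumption made so that every result fits in [0, 127].

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Lucene: illustrates the two equations above. */
class MinMaxQuantizationSketch {

  /** byte = (float - minQuantile) * 127 / (maxQuantile - minQuantile), rounded to the nearest integer. */
  static byte[] quantize(float[] src, float minQuantile, float maxQuantile) {
    float scale = 127f / (maxQuantile - minQuantile);
    byte[] dest = new byte[src.length];
    for (int i = 0; i < src.length; i++) {
      // Assumption: values outside [minQuantile, maxQuantile] are clamped before scaling.
      float clamped = Math.max(minQuantile, Math.min(maxQuantile, src[i]));
      dest[i] = (byte) Math.round((clamped - minQuantile) * scale);
    }
    return dest;
  }

  /** float = (maxQuantile - minQuantile) / 127 * byte + minQuantile */
  static float[] deQuantize(byte[] src, float minQuantile, float maxQuantile) {
    float alpha = (maxQuantile - minQuantile) / 127f;
    float[] dest = new float[src.length];
    for (int i = 0; i < src.length; i++) {
      dest[i] = alpha * src[i] + minQuantile;
    }
    return dest;
  }

  public static void main(String[] args) {
    float[] vector = {-1.0f, -0.5f, 0.0f, 0.5f, 1.0f, 3.0f};
    byte[] quantized = quantize(vector, -1.0f, 1.0f);
    float[] restored = deQuantize(quantized, -1.0f, 1.0f);
    System.out.println(Arrays.toString(quantized));
    System.out.println(Arrays.toString(restored));
  }
}
```

Because each value is rounded to the nearest of 128 levels, the restored floats differ slightly from the inputs, which is why the transformation is lossy.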

This means that the product of two float values (e.g. for dot_product) can be approximated from their byte values as follows:

   float1 * float2 ~= (byte1 * (maxQuantile - minQuantile)/127 + minQuantile) * (byte2 * (maxQuantile - minQuantile)/127 + minQuantile)
   float1 * float2 ~= (byte1 * byte2 * (maxQuantile - minQuantile)^2)/(127^2) + (byte1 * minQuantile * (maxQuantile - minQuantile)/127) + (byte2 * minQuantile * (maxQuantile - minQuantile)/127) + minQuantile^2
   let alpha = (maxQuantile - minQuantile)/127
   float1 * float2 ~= (byte1 * byte2 * alpha^2) + (byte1 * minQuantile * alpha) + (byte2 * minQuantile * alpha) + minQuantile^2
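
Summed over all dimensions, the last line above lets a dot product be approximated from an integer dot product plus correction terms. The sketch below illustrates this under stated assumptions: `QuantizedDotProductSketch` and its methods are hypothetical names, the quantization helper clamps out-of-range values, and the code is not how Lucene itself arranges the computation.

```java
/** Hypothetical helper, not part of Lucene: applies the expanded product per dimension. */
class QuantizedDotProductSketch {

  /** Min/max quantization; clamping to the quantile range is an assumption. */
  static byte[] quantize(float[] src, float minQuantile, float maxQuantile) {
    float scale = 127f / (maxQuantile - minQuantile);
    byte[] dest = new byte[src.length];
    for (int i = 0; i < src.length; i++) {
      float clamped = Math.max(minQuantile, Math.min(maxQuantile, src[i]));
      dest[i] = (byte) Math.round((clamped - minQuantile) * scale);
    }
    return dest;
  }

  /** Sum over dimensions of: b1*b2*alpha^2 + b1*min*alpha + b2*min*alpha + min^2. */
  static float approximateDot(byte[] a, byte[] b, float minQuantile, float maxQuantile) {
    float alpha = (maxQuantile - minQuantile) / 127f;
    int dot = 0;
    int sumA = 0;
    int sumB = 0;
    for (int i = 0; i < a.length; i++) {
      dot += a[i] * b[i];
      sumA += a[i];
      sumB += b[i];
    }
    return alpha * alpha * dot
        + minQuantile * alpha * sumA
        + minQuantile * alpha * sumB
        + a.length * minQuantile * minQuantile;
  }

  /** Reference dot product on the original floats. */
  static float exactDot(float[] a, float[] b) {
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }

  public static void main(String[] args) {
    float[] a = {0.1f, -0.4f, 0.8f};
    float[] b = {0.5f, 0.3f, -0.9f};
    byte[] qa = quantize(a, -1f, 1f);
    byte[] qb = quantize(b, -1f, 1f);
    System.out.println("exact:       " + exactDot(a, b));
    System.out.println("approximate: " + approximateDot(qa, qb, -1f, 1f));
  }
}
```

Only the `byte1 * byte2` term depends on both vectors; the remaining terms depend on a single vector or on the quantiles alone, so they can be computed once per vector rather than once per comparison.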

The expansion for square distance is much simpler:

  square_distance = (float1 - float2)^2
  (float1 - float2)^2 ~= (byte1 * alpha + minQuantile - byte2 * alpha - minQuantile)^2
  = (alpha*byte1 + minQuantile)^2 + (alpha*byte2 + minQuantile)^2 - 2*(alpha*byte1 + minQuantile)(alpha*byte2 + minQuantile)
  this can be simplified to:
  = alpha^2 (byte1 - byte2)^2
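
Since minQuantile cancels out, the square distance needs no correction terms. A minimal sketch of the simplified form, summed over all dimensions, is shown below; `QuantizedSquareDistanceSketch` is a hypothetical name used only for illustration.

```java
/** Hypothetical helper, not part of Lucene: alpha^2 * sum((byte1 - byte2)^2). */
class QuantizedSquareDistanceSketch {

  static float approximateSquareDistance(
      byte[] a, byte[] b, float minQuantile, float maxQuantile) {
    float alpha = (maxQuantile - minQuantile) / 127f;
    int sum = 0;
    for (int i = 0; i < a.length; i++) {
      int diff = a[i] - b[i];
      sum += diff * diff;
    }
    // minQuantile is only needed to derive alpha; it does not appear as an offset.
    return alpha * alpha * sum;
  }

  public static void main(String[] args) {
    byte[] a = {0, 127};
    byte[] b = {127, 0};
    System.out.println(approximateSquareDistance(a, b, -1f, 1f));
  }
}
```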

Field Summary

Fields

| Modifier and Type | Field | Description |
| --- | --- | --- |
| static final int | SCALAR_QUANTIZATION_SAMPLE_SIZE | |

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| ScalarQuantizer(float minQuantile, float maxQuantile, float confidenceInterval) | |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | deQuantize(byte[] src, float[] dest) | Dequantizes a byte vector into a float vector. |
| static ScalarQuantizer | fromVectors(FloatVectorValues floatVectorValues, float confidenceInterval) | Reads the float vector values and calculates the quantiles. |
| float | getConfidenceInterval() | |
| float | getConstantMultiplier() | |
| float | getLowerQuantile() | |
| float | getUpperQuantile() | |
| float | quantize(float[] src, byte[] dest, VectorSimilarityFunction similarityFunction) | Quantizes a float vector into a byte vector. |
| float | recalculateCorrectiveOffset(byte[] quantizedVector, ScalarQuantizer oldQuantizer, VectorSimilarityFunction similarityFunction) | Recalculates the score corrective offset of a previously quantized vector for the current quantiles. |
| String | toString() | |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

Field Details
- SCALAR_QUANTIZATION_SAMPLE_SIZE
  
  public static final int SCALAR_QUANTIZATION_SAMPLE_SIZE
  See Also:
  
  Constant Field Values
Constructor Details
- ScalarQuantizer
  
  public ScalarQuantizer(float minQuantile, float maxQuantile, float confidenceInterval)
  
  Parameters:
  
  minQuantile - the lower quantile of the distribution
  
  maxQuantile - the upper quantile of the distribution
  
  confidenceInterval - The configured confidence interval used to calculate the quantiles.
Method Details
- quantize
  
  public float quantize(float[] src, byte[] dest, VectorSimilarityFunction similarityFunction)
  
  Quantize a float vector into a byte vector
  
  Parameters:
  
  src - the source vector
  
  dest - the destination vector
  
  similarityFunction - the similarity function for which the corrective offset is calculated
  
  Returns:
  
  the corrective offset that needs to be applied to the score
- recalculateCorrectiveOffset
  
  public float recalculateCorrectiveOffset(byte[] quantizedVector, ScalarQuantizer oldQuantizer, VectorSimilarityFunction similarityFunction)
  
  Recalculates the score corrective offset of a previously quantized vector for the current quantiles
  
  Parameters:
  
  quantizedVector - the vector that was quantized with the old quantizer
  
  oldQuantizer - the old quantizer
  
  similarityFunction - the similarity function for which the corrective offset is calculated
  
  Returns:
  
  the new offset
- deQuantize
  
  public void deQuantize(byte[] src, float[] dest)
  
  Dequantize a byte vector into a float vector
  
  Parameters:
  
  src - the source vector
  
  dest - the destination vector
- getLowerQuantile
  
  public float getLowerQuantile()
- getUpperQuantile
  
  public float getUpperQuantile()
- getConfidenceInterval
  
  public float getConfidenceInterval()
- getConstantMultiplier
  
  public float getConstantMultiplier()
- toString
  
  public String toString()
  
  Overrides:
  
  toString in class Object
- fromVectors
  
  public static ScalarQuantizer fromVectors(FloatVectorValues floatVectorValues, float confidenceInterval) throws IOException
  
  Reads the float vector values and calculates the quantiles. If there are fewer than SCALAR_QUANTIZATION_SAMPLE_SIZE float vectors, all of them are read and used to calculate the quantiles. If there are more than SCALAR_QUANTIZATION_SAMPLE_SIZE float vectors, a random sample of SCALAR_QUANTIZATION_SAMPLE_SIZE vectors is read and the quantiles are calculated from that sample.
  
  Parameters:
  
  floatVectorValues - the float vector values from which to calculate the quantiles
  
  confidenceInterval - the confidence interval used to calculate the quantiles
  
  Returns:
  
  A new ScalarQuantizer instance
  
  Throws:
  
  IOException - if there is an error reading the float vector values
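
The sampling and quantile steps described for fromVectors can be sketched roughly as below. This is an illustration under assumptions, not the Lucene implementation: the class `QuantileSketch` and its methods are hypothetical, reservoir sampling is one possible way to draw the random sample, and the rule that a confidence interval `c` trims a fraction `(1 - c) / 2` of the values from each tail is an assumption about how the interval maps to quantiles.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical helper, not part of Lucene: sampling plus quantile selection. */
class QuantileSketch {

  /** Returns all vectors if there are at most sampleSize, else a uniform random sample. */
  static float[][] sample(float[][] vectors, int sampleSize, Random random) {
    if (vectors.length <= sampleSize) {
      return vectors;
    }
    // Reservoir sampling (Algorithm R): a choice made for this sketch.
    float[][] reservoir = Arrays.copyOf(vectors, sampleSize);
    for (int i = sampleSize; i < vectors.length; i++) {
      int j = random.nextInt(i + 1);
      if (j < sampleSize) {
        reservoir[j] = vectors[i];
      }
    }
    return reservoir;
  }

  /**
   * Returns {minQuantile, maxQuantile}.
   * Assumption: (1 - confidenceInterval) / 2 of the values are trimmed from each tail.
   */
  static float[] quantiles(float[] values, float confidenceInterval) {
    float[] sorted = values.clone();
    Arrays.sort(sorted);
    int cut = (int) (sorted.length * (1f - confidenceInterval) / 2f + 0.5f);
    // Never trim past the middle of the array.
    cut = Math.min(cut, (sorted.length - 1) / 2);
    return new float[] {sorted[cut], sorted[sorted.length - 1 - cut]};
  }

  public static void main(String[] args) {
    float[] values = new float[100];
    for (int i = 0; i < values.length; i++) {
      values[i] = i;
    }
    System.out.println(Arrays.toString(quantiles(values, 0.9f)));
  }
}
```

A confidence interval below 1 discards outliers at both ends, which narrows [minQuantile, maxQuantile] and gives the remaining values finer resolution at the cost of clipping the extremes.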
