Class ScalarQuantizer

java.lang.Object
org.apache.lucene.util.ScalarQuantizer

public class ScalarQuantizer extends Object
Scalar quantizes float vectors into `int8` byte values. This is a lossy transformation. Scalar quantization first calculates the quantiles of the float vector values, using the configured confidence interval. The resulting [minQuantile, maxQuantile] range is then used to scale the values into the range [0, 127], and each value is bucketed into the nearest byte value.

How Scalar Quantization Works

The basic mathematical equations behind this are fairly straightforward and based on min/max normalization. Given a float vector `v` and a confidenceInterval `q`, we can calculate the quantiles of the vector values [minQuantile, maxQuantile].

   byte = (float - minQuantile) * 127/(maxQuantile - minQuantile)
   float = (maxQuantile - minQuantile)/127 * byte + minQuantile
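
The two formulas above can be sketched in plain Java as follows. This is only an illustration: the class and method names are hypothetical and not part of Lucene, and clamping out-of-range values to [minQuantile, maxQuantile] before scaling is an assumption made here so that the result always fits in [0, 127].

```java
/** Hypothetical helper (not a Lucene class) illustrating the min/max formulas. */
final class MinMaxQuantizationSketch {

  /** byte = (float - minQuantile) * 127 / (maxQuantile - minQuantile), rounded to nearest. */
  static byte quantize(float value, float minQuantile, float maxQuantile) {
    // Assumption: values outside the quantile range are clamped before scaling.
    float clamped = Math.max(minQuantile, Math.min(maxQuantile, value));
    float scale = 127f / (maxQuantile - minQuantile);
    return (byte) Math.round((clamped - minQuantile) * scale);
  }

  /** float = (maxQuantile - minQuantile) / 127 * byte + minQuantile. */
  static float dequantize(byte quantized, float minQuantile, float maxQuantile) {
    float alpha = (maxQuantile - minQuantile) / 127f;
    return alpha * quantized + minQuantile;
  }

  public static void main(String[] args) {
    float min = -1f, max = 1f;
    byte b = quantize(0.5f, min, max);
    // The round trip is lossy: the result differs from 0.5 by at most alpha / 2.
    System.out.println(b + " -> " + dequantize(b, min, max));
  }
}
```

The round trip recovers the original value only to within half a bucket width, which is why the transformation is described as lossy.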
 

This means that to multiply two float values together (e.g. for dot_product), we can do the following:

   float1 * float2 ~= (byte1 * (maxQuantile - minQuantile)/127 + minQuantile) * (byte2 * (maxQuantile - minQuantile)/127 + minQuantile)
   float1 * float2 ~= (byte1 * byte2 * (maxQuantile - minQuantile)^2)/(127^2) + (byte1 * minQuantile * (maxQuantile - minQuantile)/127) + (byte2 * minQuantile * (maxQuantile - minQuantile)/127) + minQuantile^2
   let alpha = (maxQuantile - minQuantile)/127
   float1 * float2 ~= (byte1 * byte2 * alpha^2) + (byte1 * minQuantile * alpha) + (byte2 * minQuantile * alpha) + minQuantile^2
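
Summed over all dimensions, the expansion above lets the dot product be approximated from integer arithmetic on the bytes plus a few float corrections. The sketch below is illustrative only: the class and method names are hypothetical, and it assumes both vectors were quantized with the same minQuantile and maxQuantile.

```java
/** Hypothetical helper (not a Lucene class) applying the per-value expansion to whole vectors. */
final class QuantizedDotProductSketch {

  /** Approximates dot(float1, float2) from two byte vectors quantized with the same quantiles. */
  static float approxDotProduct(byte[] a, byte[] b, float minQuantile, float maxQuantile) {
    float alpha = (maxQuantile - minQuantile) / 127f;
    int byteDot = 0; // sum of byte1 * byte2
    int sumA = 0;    // sum of byte1
    int sumB = 0;    // sum of byte2
    for (int i = 0; i < a.length; i++) {
      byteDot += a[i] * b[i];
      sumA += a[i];
      sumB += b[i];
    }
    // Sum of: byte1*byte2*alpha^2 + byte1*minQuantile*alpha + byte2*minQuantile*alpha + minQuantile^2
    return alpha * alpha * byteDot
        + minQuantile * alpha * (sumA + sumB)
        + a.length * minQuantile * minQuantile;
  }
}
```

Only the `byteDot` term depends on both vectors at once; the remaining terms depend on a single vector or on constants, so in principle they can be computed ahead of time.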
 

The expansion for square distance is much simpler:

  square_distance = (float1 - float2)^2
  (float1 - float2)^2 ~= (byte1 * alpha + minQuantile - byte2 * alpha - minQuantile)^2
  = (alpha*byte1 + minQuantile)^2 + (alpha*byte2 + minQuantile)^2 - 2*(alpha*byte1 + minQuantile)*(alpha*byte2 + minQuantile)
  this can be simplified to:
  = alpha^2 (byte1 - byte2)^2
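
Because the minQuantile terms cancel, the square distance over whole vectors reduces to an integer sum scaled once by alpha^2. A minimal sketch, with hypothetical class and method names and assuming both vectors share the same quantiles:

```java
/** Hypothetical helper (not a Lucene class) illustrating alpha^2 * (byte1 - byte2)^2. */
final class QuantizedSquareDistanceSketch {

  /** Approximates the square distance between two vectors quantized with the same quantiles. */
  static float approxSquareDistance(byte[] a, byte[] b, float minQuantile, float maxQuantile) {
    float alpha = (maxQuantile - minQuantile) / 127f;
    int sumSquaredDiffs = 0;
    for (int i = 0; i < a.length; i++) {
      int diff = a[i] - b[i];
      sumSquaredDiffs += diff * diff;
    }
    // No minQuantile correction is needed: it cancels in (float1 - float2).
    return alpha * alpha * sumSquaredDiffs;
  }
}
```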
 
  • Field Details

    • SCALAR_QUANTIZATION_SAMPLE_SIZE

      public static final int SCALAR_QUANTIZATION_SAMPLE_SIZE
  • Constructor Details

    • ScalarQuantizer

      public ScalarQuantizer(float minQuantile, float maxQuantile, float confidenceInterval)
      Parameters:
      minQuantile - the lower quantile of the distribution
      maxQuantile - the upper quantile of the distribution
      confidenceInterval - The configured confidence interval used to calculate the quantiles.
  • Method Details

    • quantize

      public float quantize(float[] src, byte[] dest, VectorSimilarityFunction similarityFunction)
      Quantize a float vector into a byte vector
      Parameters:
      src - the source vector
      dest - the destination vector
      similarityFunction - the similarity function used to calculate the quantile
      Returns:
      the corrective offset that needs to be applied to the score
    • recalculateCorrectiveOffset

      public float recalculateCorrectiveOffset(byte[] quantizedVector, ScalarQuantizer oldQuantizer, VectorSimilarityFunction similarityFunction)
      Recalculates the score corrective offset of a previously quantized vector, given the current quantiles of this quantizer
      Parameters:
      quantizedVector - the vector that was quantized with the old quantizer
      oldQuantizer - the old quantizer
      similarityFunction - the similarity function used to calculate the quantile
      Returns:
      the new offset
    • deQuantize

      public void deQuantize(byte[] src, float[] dest)
      Dequantize a byte vector into a float vector
      Parameters:
      src - the source vector
      dest - the destination vector
    • getLowerQuantile

      public float getLowerQuantile()
    • getUpperQuantile

      public float getUpperQuantile()
    • getConfidenceInterval

      public float getConfidenceInterval()
    • getConstantMultiplier

      public float getConstantMultiplier()
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • fromVectors

      public static ScalarQuantizer fromVectors(FloatVectorValues floatVectorValues, float confidenceInterval) throws IOException
      Reads the float vector values and calculates the quantiles. If the number of float vectors is less than SCALAR_QUANTIZATION_SAMPLE_SIZE, all values are read and the quantiles are calculated from them. If the number of float vectors is greater than SCALAR_QUANTIZATION_SAMPLE_SIZE, a random sample of SCALAR_QUANTIZATION_SAMPLE_SIZE vectors is read and the quantiles are calculated from that sample.
      Parameters:
      floatVectorValues - the float vector values from which to calculate the quantiles
      confidenceInterval - the confidence interval used to calculate the quantiles
      Returns:
      A new ScalarQuantizer instance
      Throws:
      IOException - if there is an error reading the float vector values
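
As a rough illustration of how a confidence interval can be turned into a [minQuantile, maxQuantile] pair, the sketch below sorts the values and trims an equal share from each tail. This is one common interpretation and an assumption made here, not necessarily how fromVectors computes its quantiles; the class and method names are hypothetical, and the random sampling step is left out.

```java
import java.util.Arrays;

/** Hypothetical helper (not a Lucene class): quantiles by trimming both tails equally. */
final class ConfidenceIntervalQuantilesSketch {

  /**
   * Returns {minQuantile, maxQuantile} for the given values.
   * Assumes a non-empty array and a confidenceInterval in (0, 1].
   */
  static float[] quantiles(float[] values, float confidenceInterval) {
    float[] sorted = values.clone();
    Arrays.sort(sorted);
    // Number of values to discard from each end, e.g. 5% per tail for an interval of 0.9.
    int tail = (int) ((1.0 - confidenceInterval) / 2.0 * sorted.length);
    return new float[] {sorted[tail], sorted[sorted.length - 1 - tail]};
  }
}
```

With a confidence interval of 1.0 nothing is trimmed, so the pair is simply the minimum and maximum of the values.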