Class ScalarQuantizer

java.lang.Object
org.apache.lucene.util.quantization.ScalarQuantizer

public class ScalarQuantizer extends Object
Quantizes float vectors into `int8` byte values using scalar quantization. This is a lossy transformation. Scalar quantization first calculates the quantiles of the float vector values, using the configured confidence interval. The resulting [minQuantile, maxQuantile] range is then used to scale the values into the range [0, 127], and each scaled value is bucketed into the nearest byte value.
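As an illustration of how a confidence interval can yield a [minQuantile, maxQuantile] pair, here is a minimal sketch. It assumes the interval is centered, so the same number of values is discarded from each tail. `QuantileSketch` and `quantiles` are hypothetical names, and this is not claimed to be the computation Lucene performs internally.

```java
final class QuantileSketch {
  // Hypothetical helper, not part of Lucene: returns {minQuantile, maxQuantile}.
  // Assumes values is non-empty and 0 < confidenceInterval <= 1.
  static float[] quantiles(float[] values, float confidenceInterval) {
    float[] sorted = values.clone();
    java.util.Arrays.sort(sorted);
    // Number of values to discard from each tail, rounded to the nearest integer.
    int tail = (int) (sorted.length * (1f - confidenceInterval) / 2f + 0.5f);
    // Keep both indices valid even for very small confidence intervals.
    tail = Math.min(tail, (sorted.length - 1) / 2);
    return new float[] {sorted[tail], sorted[sorted.length - 1 - tail]};
  }

  public static void main(String[] args) {
    float[] values = new float[100];
    for (int i = 0; i < values.length; i++) {
      values[i] = i;
    }
    float[] q = quantiles(values, 0.9f);
    System.out.println("minQuantile=" + q[0] + ", maxQuantile=" + q[1]);
  }
}
```

A confidence interval of 1 keeps the full min/max range, while smaller intervals trim outliers at the cost of clipping them during quantization.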

How Scalar Quantization Works

The basic mathematical equations behind this are fairly straightforward and based on min/max normalization. Given a float vector `v` and a confidenceInterval `q`, we can calculate the quantiles of the vector values, [minQuantile, maxQuantile], and convert between float and byte values as follows:

   byte = (float - minQuantile) * 127/(maxQuantile - minQuantile)
   float = (maxQuantile - minQuantile)/127 * byte + minQuantile
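The two equations above can be sketched for a single value as follows. The class and method names are hypothetical. Clamping out-of-range inputs to the quantile boundaries is an assumption of this sketch, and rounding implements the "nearest byte value" bucketing described earlier.

```java
final class ScalarQuantizationSketch {
  // Hypothetical helper: maps a float into [0, 127] using the first equation.
  static byte quantize(float value, float minQuantile, float maxQuantile) {
    // Assumption: values outside [minQuantile, maxQuantile] are clamped to the boundary.
    float clamped = Math.max(minQuantile, Math.min(maxQuantile, value));
    float scale = 127f / (maxQuantile - minQuantile);
    // Round to the nearest bucket.
    return (byte) Math.round((clamped - minQuantile) * scale);
  }

  // Hypothetical helper: maps a byte back to an approximate float (second equation).
  static float dequantize(byte b, float minQuantile, float maxQuantile) {
    float alpha = (maxQuantile - minQuantile) / 127f;
    return alpha * b + minQuantile;
  }

  public static void main(String[] args) {
    float min = -1f;
    float max = 1f;
    byte q = quantize(0.5f, min, max);
    System.out.println(q + " -> " + dequantize(q, min, max));
  }
}
```

The round trip is lossy: the recovered float differs from the input by at most half a bucket width, that is, (maxQuantile - minQuantile)/254, for inputs inside the quantile range.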
 

This means that to multiply two float values together (e.g. for dot_product), we can do the following:

   float1 * float2 ~= (byte1 * (maxQuantile - minQuantile)/127 + minQuantile) * (byte2 * (maxQuantile - minQuantile)/127 + minQuantile)
   float1 * float2 ~= (byte1 * byte2 * (maxQuantile - minQuantile)^2)/(127^2) + (byte1 * minQuantile * (maxQuantile - minQuantile)/127) + (byte2 * minQuantile * (maxQuantile - minQuantile)/127) + minQuantile^2
   let alpha = (maxQuantile - minQuantile)/127
   float1 * float2 ~= (byte1 * byte2 * alpha^2) + (byte1 * minQuantile * alpha) + (byte2 * minQuantile * alpha) + minQuantile^2
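Summed over all dimensions, the expansion above lets the dot product be computed from integer arithmetic on the bytes plus a few correction terms. The sketch below checks this against the dot product of the dequantized vectors. All names are hypothetical, and it does not attempt to reproduce how Lucene stores or applies its corrective offset.

```java
final class QuantizedDotProductSketch {
  // Dot product computed from bytes via the expanded form, summed over dimensions.
  static float dotFromBytes(byte[] b1, byte[] b2, float minQuantile, float maxQuantile) {
    float alpha = (maxQuantile - minQuantile) / 127f;
    int byteDot = 0;
    int sum1 = 0;
    int sum2 = 0;
    for (int i = 0; i < b1.length; i++) {
      byteDot += b1[i] * b2[i]; // integer-only inner loop
      sum1 += b1[i];            // depends on the first vector only
      sum2 += b2[i];            // depends on the second vector only
    }
    return alpha * alpha * byteDot
        + alpha * minQuantile * sum1
        + alpha * minQuantile * sum2
        + b1.length * minQuantile * minQuantile;
  }

  // Reference: dequantize each component, then multiply in float space.
  static float dotFromDequantized(byte[] b1, byte[] b2, float minQuantile, float maxQuantile) {
    float alpha = (maxQuantile - minQuantile) / 127f;
    float dot = 0f;
    for (int i = 0; i < b1.length; i++) {
      dot += (alpha * b1[i] + minQuantile) * (alpha * b2[i] + minQuantile);
    }
    return dot;
  }

  public static void main(String[] args) {
    byte[] b1 = {0, 64, 127};
    byte[] b2 = {127, 32, 10};
    System.out.println(dotFromBytes(b1, b2, -1f, 1f));
    System.out.println(dotFromDequantized(b1, b2, -1f, 1f));
  }
}
```

The terms involving `sum1` and `sum2` each depend on a single vector, so they can be computed once per vector rather than once per comparison.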
 

The expansion for square distance is much simpler:

  square_distance = (float1 - float2)^2
  (float1 - float2)^2 ~= (byte1 * alpha + minQuantile - byte2 * alpha - minQuantile)^2
  = (alpha*byte1 + minQuantile)^2 + (alpha*byte2 + minQuantile)^2 - 2*(alpha*byte1 + minQuantile)(alpha*byte2 + minQuantile)
  this can be simplified to:
  = alpha^2 (byte1 - byte2)^2
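Because minQuantile cancels, the square distance needs only the byte differences and one multiplication by alpha^2. The following sketch, with hypothetical names, compares that shortcut with the distance computed on dequantized values.

```java
final class QuantizedSquareDistanceSketch {
  // Square distance from bytes: alpha^2 * sum((byte1 - byte2)^2).
  static float squareDistanceFromBytes(
      byte[] b1, byte[] b2, float minQuantile, float maxQuantile) {
    float alpha = (maxQuantile - minQuantile) / 127f;
    int sum = 0;
    for (int i = 0; i < b1.length; i++) {
      int diff = b1[i] - b2[i];
      sum += diff * diff; // integer-only inner loop
    }
    return alpha * alpha * sum;
  }

  // Reference: dequantize each component, then take the squared difference.
  static float squareDistanceFromDequantized(
      byte[] b1, byte[] b2, float minQuantile, float maxQuantile) {
    float alpha = (maxQuantile - minQuantile) / 127f;
    float sum = 0f;
    for (int i = 0; i < b1.length; i++) {
      float diff = (alpha * b1[i] + minQuantile) - (alpha * b2[i] + minQuantile);
      sum += diff * diff;
    }
    return sum;
  }

  public static void main(String[] args) {
    byte[] b1 = {0, 64, 127};
    byte[] b2 = {127, 32, 10};
    System.out.println(squareDistanceFromBytes(b1, b2, -1f, 1f));
    System.out.println(squareDistanceFromDequantized(b1, b2, -1f, 1f));
  }
}
```

Unlike the dot product, no per-vector correction terms are needed here, only the shared factor alpha^2.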
 
  • Field Details

    • SCALAR_QUANTIZATION_SAMPLE_SIZE

      public static final int SCALAR_QUANTIZATION_SAMPLE_SIZE
  • Constructor Details

    • ScalarQuantizer

      public ScalarQuantizer(float minQuantile, float maxQuantile, byte bits)
      Parameters:
      minQuantile - the lower quantile of the distribution
      maxQuantile - the upper quantile of the distribution
      bits - the number of bits to use for quantization
  • Method Details

    • quantize

      public float quantize(float[] src, byte[] dest, VectorSimilarityFunction similarityFunction)
      Quantize a float vector into a byte vector
      Parameters:
      src - the source vector
      dest - the destination vector
      similarityFunction - the similarity function used to calculate the quantile
      Returns:
      the corrective offset that needs to be applied to the score
    • recalculateCorrectiveOffset

      public float recalculateCorrectiveOffset(byte[] quantizedVector, ScalarQuantizer oldQuantizer, VectorSimilarityFunction similarityFunction)
      Recalculates the score corrective offset of a vector quantized by an old quantizer, given the current quantiles
      Parameters:
      quantizedVector - the old vector
      oldQuantizer - the old quantizer
      similarityFunction - the similarity function used to calculate the quantile
      Returns:
      the new offset
    • getLowerQuantile

      public float getLowerQuantile()
    • getUpperQuantile

      public float getUpperQuantile()
    • getConstantMultiplier

      public float getConstantMultiplier()
    • getBits

      public byte getBits()
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • fromVectors

      public static ScalarQuantizer fromVectors(FloatVectorValues floatVectorValues, float confidenceInterval, int totalVectorCount, byte bits) throws IOException
      Reads the float vector values and calculates the quantiles. If the number of float vectors is less than SCALAR_QUANTIZATION_SAMPLE_SIZE, all the values are read and the quantiles are calculated from them. If the number of float vectors is greater than SCALAR_QUANTIZATION_SAMPLE_SIZE, a random sample of SCALAR_QUANTIZATION_SAMPLE_SIZE vectors is read and the quantiles are calculated from that sample.
      Parameters:
      floatVectorValues - the float vector values from which to calculate the quantiles
      confidenceInterval - the confidence interval used to calculate the quantiles
      totalVectorCount - the total number of live float vectors in the index. This is vital for accounting for deleted documents when calculating the quantiles.
      bits - the number of bits to use for quantization
      Returns:
      A new ScalarQuantizer instance
      Throws:
      IOException - if there is an error reading the float vector values
    • fromVectorsAutoInterval

      public static ScalarQuantizer fromVectorsAutoInterval(FloatVectorValues floatVectorValues, VectorSimilarityFunction function, int totalVectorCount, byte bits) throws IOException
      Throws:
      IOException