Class ScalarQuantizer


  • public class ScalarQuantizer
    extends Object
    Quantizes float vectors into `int8` byte values. This is a lossy transformation. Scalar quantization first calculates the quantiles of the float vector values, using the configured confidence interval. The quantiles [minQuantile, maxQuantile] are then used to scale the values into the range [0, 127], and each scaled value is rounded to the nearest byte value.
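
    As a rough illustration of how a confidence interval can be turned into a [minQuantile, maxQuantile] pair, the sketch below sorts the values and trims an equal share from each tail. The class `QuantileSketch` and its method are hypothetical and not part of this API; splitting the excluded mass evenly across both tails is an assumption, and the library may select quantiles differently.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of ScalarQuantizer. */
final class QuantileSketch {

  /**
   * Returns {minQuantile, maxQuantile} so that roughly confidenceInterval of the
   * values lie between them. Assumption: the excluded values are split evenly
   * between the lower and the upper tail.
   */
  static float[] quantiles(float[] values, float confidenceInterval) {
    if (values.length == 0) {
      throw new IllegalArgumentException("values must not be empty");
    }
    if (!(confidenceInterval > 0f && confidenceInterval <= 1f)) {
      throw new IllegalArgumentException("confidenceInterval must be in (0, 1]");
    }
    float[] sorted = values.clone();
    Arrays.sort(sorted);
    // Number of values to drop from each tail (rounded down).
    int cut = (int) ((1.0 - confidenceInterval) / 2.0 * sorted.length);
    return new float[] {sorted[cut], sorted[sorted.length - 1 - cut]};
  }
}
```

    A confidence interval of 1 keeps the true minimum and maximum, while smaller values discard outliers at both ends before the range is mapped onto [0, 127].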

    How Scalar Quantization Works

    The underlying math is fairly straightforward and based on min/max normalization. Given a float vector `v` and a confidenceInterval `q`, we can calculate the quantiles of the vector values, [minQuantile, maxQuantile], and convert between float and byte values as follows:

       byte = (float - minQuantile) * 127/(maxQuantile - minQuantile)
       float = (maxQuantile - minQuantile)/127 * byte + minQuantile
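
    The two equations above can be sketched per value as follows. The class `MinMaxQuantizationSketch` is hypothetical and not the library's implementation; clamping out-of-range inputs to the quantile bounds is an assumption made so that the result always fits in [0, 127].

```java
/** Hypothetical per-value sketch of the two equations; not the library code. */
final class MinMaxQuantizationSketch {

  /** byte = (float - minQuantile) * 127 / (maxQuantile - minQuantile), rounded to nearest. */
  static byte quantize(float value, float minQuantile, float maxQuantile) {
    // Assumption: values outside [minQuantile, maxQuantile] are clamped to the bounds.
    float clamped = Math.max(minQuantile, Math.min(maxQuantile, value));
    return (byte) Math.round((clamped - minQuantile) * 127f / (maxQuantile - minQuantile));
  }

  /** float = (maxQuantile - minQuantile) / 127 * byte + minQuantile */
  static float deQuantize(byte quantized, float minQuantile, float maxQuantile) {
    return (maxQuantile - minQuantile) / 127f * quantized + minQuantile;
  }
}
```

    Because of the rounding step, a round trip through `quantize` and `deQuantize` recovers an in-range value only to within about half a bucket width, (maxQuantile - minQuantile)/254.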
     

    This means that to multiply two float values together (e.g. for dot_product), we can do the following:

       float1 * float2 ~= (byte1 * (maxQuantile - minQuantile)/127 + minQuantile) * (byte2 * (maxQuantile - minQuantile)/127 + minQuantile)
       float1 * float2 ~= (byte1 * byte2 * (maxQuantile - minQuantile)^2)/(127^2) + (byte1 * minQuantile * (maxQuantile - minQuantile)/127) + (byte2 * minQuantile * (maxQuantile - minQuantile)/127) + minQuantile^2
       let alpha = (maxQuantile - minQuantile)/127
       float1 * float2 ~= (byte1 * byte2 * alpha^2) + (byte1 * minQuantile * alpha) + (byte2 * minQuantile * alpha) + minQuantile^2
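
    Summed over all dimensions, the last line of the expansion can be sketched as below. The class `QuantizedDotProductSketch` is hypothetical; it only illustrates the algebra and is not how the library scores vectors.

```java
/** Hypothetical sketch of the dot-product expansion; not the library code. */
final class QuantizedDotProductSketch {

  /** Approximates the float dot product of two vectors from their quantized bytes. */
  static float dotProduct(byte[] a, byte[] b, float minQuantile, float maxQuantile) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("vectors must have the same dimension");
    }
    float alpha = (maxQuantile - minQuantile) / 127f;
    int byteDot = 0; // sum of byte1 * byte2
    int byteSum = 0; // sum of byte1 + byte2
    for (int i = 0; i < a.length; i++) {
      byteDot += a[i] * b[i];
      byteSum += a[i] + b[i];
    }
    // alpha^2 * sum(b1*b2) + minQuantile * alpha * sum(b1 + b2) + dims * minQuantile^2
    return alpha * alpha * byteDot
        + minQuantile * alpha * byteSum
        + a.length * minQuantile * minQuantile;
  }
}
```

    Only the `byte1 * byte2` term depends on both vectors at once. The remaining terms depend on one vector or on constants, so they could be computed once per vector and added to the score afterwards.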
     

    The expansion for square distance is much simpler:

      square_distance = (float1 - float2)^2
      (float1 - float2)^2 ~= (byte1 * alpha + minQuantile - byte2 * alpha - minQuantile)^2
      = (alpha*byte1 + minQuantile)^2 + (alpha*byte2 + minQuantile)^2 - 2*(alpha*byte1 + minQuantile)(alpha*byte2 + minQuantile)
      this can be simplified to:
      = alpha^2 (byte1 - byte2)^2
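
    Since the minQuantile terms cancel, the squared distance needs nothing beyond the byte values and `alpha`. A minimal sketch, using a hypothetical `QuantizedSquareDistanceSketch` class that is not part of this API:

```java
/** Hypothetical sketch of the square-distance simplification; not the library code. */
final class QuantizedSquareDistanceSketch {

  /** Approximates the float squared distance as alpha^2 * sum((byte1 - byte2)^2). */
  static float squareDistance(byte[] a, byte[] b, float minQuantile, float maxQuantile) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("vectors must have the same dimension");
    }
    float alpha = (maxQuantile - minQuantile) / 127f;
    int sum = 0;
    for (int i = 0; i < a.length; i++) {
      int diff = a[i] - b[i];
      sum += diff * diff;
    }
    return alpha * alpha * sum;
  }
}
```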
     
    • Field Detail

      • SCALAR_QUANTIZATION_SAMPLE_SIZE

        public static final int SCALAR_QUANTIZATION_SAMPLE_SIZE
        See Also:
        Constant Field Values
    • Constructor Detail

      • ScalarQuantizer

        public ScalarQuantizer(float minQuantile,
                               float maxQuantile,
                               float confidenceInterval)
        Parameters:
        minQuantile - the lower quantile of the distribution
        maxQuantile - the upper quantile of the distribution
        confidenceInterval - The configured confidence interval used to calculate the quantiles.
    • Method Detail

      • quantize

        public float quantize(float[] src,
                              byte[] dest,
                              VectorSimilarityFunction similarityFunction)
        Quantize a float vector into a byte vector
        Parameters:
        src - the source vector
        dest - the destination vector
        similarityFunction - the similarity function used to calculate the quantile
        Returns:
        the corrective offset that needs to be applied to the score
      • recalculateCorrectiveOffset

        public float recalculateCorrectiveOffset(byte[] quantizedVector,
                                                 ScalarQuantizer oldQuantizer,
                                                 VectorSimilarityFunction similarityFunction)
        Recalculate the score corrective offset of a vector that was quantized by an old quantizer, given the current quantiles of this quantizer
        Parameters:
        quantizedVector - the old vector
        oldQuantizer - the old quantizer
        similarityFunction - the similarity function used to calculate the quantile
        Returns:
        the new offset
      • deQuantize

        public void deQuantize(byte[] src,
                               float[] dest)
        Dequantize a byte vector into a float vector
        Parameters:
        src - the source vector
        dest - the destination vector
      • getLowerQuantile

        public float getLowerQuantile()
      • getUpperQuantile

        public float getUpperQuantile()
      • getConfidenceInterval

        public float getConfidenceInterval()
      • getConstantMultiplier

        public float getConstantMultiplier()
      • fromVectors

        public static ScalarQuantizer fromVectors(FloatVectorValues floatVectorValues,
                                                  float confidenceInterval,
                                                  int totalVectorCount)
                                           throws IOException
        Reads the float vector values and calculates the quantiles. If the number of float vectors is less than SCALAR_QUANTIZATION_SAMPLE_SIZE, all the values are read and used to calculate the quantiles. If the number of float vectors is greater than SCALAR_QUANTIZATION_SAMPLE_SIZE, a random sample of SCALAR_QUANTIZATION_SAMPLE_SIZE vectors is read and the quantiles are calculated from that sample.
        Parameters:
        floatVectorValues - the float vector values from which to calculate the quantiles
        confidenceInterval - the confidence interval used to calculate the quantiles
        totalVectorCount - the total number of live float vectors in the index. This is vital for accounting for deleted documents when calculating the quantiles.
        Returns:
        A new ScalarQuantizer instance
        Throws:
        IOException - if there is an error reading the float vector values
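
        One way to picture the sampling behaviour described for fromVectors is reservoir sampling over vector indices, sketched below. The class `VectorSamplingSketch` and its parameters are hypothetical; reservoir sampling is used here only as a common technique for drawing a fixed-size uniform sample, and the description above does not say which sampling method the library actually uses.

```java
import java.util.Random;

/** Hypothetical sampling sketch; not the library code. */
final class VectorSamplingSketch {

  /**
   * Picks min(totalCount, sampleSize) distinct indices from [0, totalCount).
   * When totalCount does not exceed sampleSize, every index is returned.
   */
  static int[] sampleIndices(int totalCount, int sampleSize, Random random) {
    int n = Math.min(totalCount, sampleSize);
    int[] reservoir = new int[n];
    // Fill the reservoir with the first n indices.
    for (int i = 0; i < n; i++) {
      reservoir[i] = i;
    }
    // Each later index replaces a random slot with probability n / (i + 1).
    for (int i = n; i < totalCount; i++) {
      int j = random.nextInt(i + 1);
      if (j < n) {
        reservoir[j] = i;
      }
    }
    return reservoir;
  }
}
```

        The selected indices would then identify which vectors to read before calculating the quantiles from their values.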