Class Lucene99FlatVectorsFormat

java.lang.Object
org.apache.lucene.codecs.FlatVectorsFormat
org.apache.lucene.codecs.lucene99.Lucene99FlatVectorsFormat

public final class Lucene99FlatVectorsFormat extends FlatVectorsFormat
Lucene 9.9 flat vector format, which encodes numeric vector values.

.vec (vector data) file

For each field:

  • Vector data ordered by field, document ordinal, and vector dimension. When the vectorEncoding is BYTE, each sample is stored as a single byte. When it is FLOAT32, each sample is stored as an IEEE float in little-endian byte order.
  • DocIds, encoded by IndexedDISI.writeBitSet(DocIdSetIterator, IndexOutput, byte). Written only in the sparse case.
  • OrdToDoc, encoded by DirectMonotonicWriter. Written only in the sparse case.
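
The FLOAT32 layout described above can be illustrated with plain java.nio. This is a minimal sketch, not Lucene's writer: the class FlatVectorLayoutSketch and its methods are hypothetical names, and the sketch covers only one field's raw vector data, ignoring any headers, footers, or alignment padding the real .vec file may contain.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical illustration class; not part of Lucene.
public class FlatVectorLayoutSketch {

    /** Byte offset of a vector, relative to the start of the field's vector data. */
    static long vectorOffset(int ordinal, int dimension, int bytesPerSample) {
        return (long) ordinal * dimension * bytesPerSample;
    }

    /** Writes vectors back to back, each sample as a little-endian IEEE 754 float. */
    static byte[] encodeFloat32(float[][] vectors, int dimension) {
        ByteBuffer buf = ByteBuffer.allocate(vectors.length * dimension * Float.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (float[] vector : vectors) {
            if (vector.length != dimension) {
                throw new IllegalArgumentException("vector has wrong dimension");
            }
            for (float sample : vector) {
                buf.putFloat(sample);
            }
        }
        return buf.array();
    }

    /** Reads the vector with the given ordinal back out of the encoded data. */
    static float[] readFloat32(byte[] data, int ordinal, int dimension) {
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        int base = (int) vectorOffset(ordinal, dimension, Float.BYTES);
        float[] out = new float[dimension];
        for (int i = 0; i < dimension; i++) {
            out[i] = buf.getFloat(base + i * Float.BYTES);
        }
        return out;
    }

    public static void main(String[] args) {
        float[][] vectors = {{1.0f, 2.0f}, {3.0f, 4.0f}};
        byte[] data = encodeFloat32(vectors, 2);
        // The vector at ordinal 1 starts 8 bytes in (1 * 2 samples * 4 bytes).
        System.out.println(Arrays.toString(readFloat32(data, 1, 2))); // prints [3.0, 4.0]
    }
}
```

Because every vector in a field has the same dimension and sample width, a vector's position follows directly from its ordinal, which is why no per-vector offsets need to be stored for the vector data itself.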

.vemf (vector metadata) file

For each field:

  • [int32] field number
  • [int32] vector similarity function ordinal
  • [vlong] offset to this field's vectors in the .vec file
  • [vlong] length of this field's vectors, in bytes
  • [vint] dimension of this field's vectors
  • [int] the number of documents having values for this field
  • [int8] density flag: -1 means dense (all documents have a value for the field); 0 means sparse (some documents are missing values)
  • DocIds, encoded by IndexedDISI.writeBitSet(DocIdSetIterator, IndexOutput, byte)
  • OrdToDoc, encoded by DirectMonotonicWriter. Written only in the sparse case.
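
As a rough illustration of how the metadata entries above relate to one another, the following sketch interprets the dense/sparse flag and computes the vector data length implied by the other entries. VectorFieldMetaSketch and its members are hypothetical names, not Lucene APIs, and the length check assumes exactly one vector per document with a value.

```java
// Hypothetical illustration class; not part of Lucene.
public class VectorFieldMetaSketch {

    /** Whether every document has a vector (dense) or only some do (sparse). */
    enum Density { DENSE, SPARSE }

    /** Maps the [int8] flag described above to a density; other values are rejected. */
    static Density densityFromFlag(byte flag) {
        switch (flag) {
            case -1:
                return Density.DENSE;
            case 0:
                return Density.SPARSE;
            default:
                throw new IllegalArgumentException("unexpected density flag: " + flag);
        }
    }

    /**
     * Vector data length in bytes implied by the metadata, assuming one vector
     * per document: 1 byte per sample for BYTE, 4 bytes per sample for FLOAT32.
     */
    static long expectedVectorDataLength(int docCount, int dimension, int bytesPerSample) {
        return Math.multiplyExact((long) docCount,
                Math.multiplyExact((long) dimension, (long) bytesPerSample));
    }

    public static void main(String[] args) {
        System.out.println(densityFromFlag((byte) -1)); // prints DENSE
        // 1000 documents, 128 dimensions, FLOAT32 samples.
        System.out.println(expectedVectorDataLength(1000, 128, Float.BYTES)); // prints 512000
    }
}
```

A reader could compare the computed length against the stored [vlong] length as a sanity check before reading vectors from the .vec file.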
WARNING: This API is experimental and might change in incompatible ways in the next release.