Class OrdToDocDISIReaderConfiguration
java.lang.Object
org.apache.lucene.codecs.lucene95.OrdToDocDISIReaderConfiguration
- All Implemented Interfaces:
Accountable
Configuration for DirectMonotonicReader and IndexedDISI for reading sparse vectors. The format in the static writing methods adheres to the Lucene95HnswVectorsFormat.
Field Summary
Fields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE
-
Method Summary
Modifier and Type | Method | Description
static OrdToDocDISIReaderConfiguration | fromStoredMeta(IndexInput inputMeta, int size) | Reads in the necessary fields stored in the outputMeta to configure DirectMonotonicReader and IndexedDISI.
DirectMonotonicReader | getDirectMonotonicReader(IndexInput dataIn) |
IndexedDISI | getIndexedDISI(IndexInput dataIn) |
boolean | isDense() |
boolean | isEmpty() |
long | ramBytesUsed() | Return the memory usage of this object in bytes.
static void | writeStoredMeta(int directMonotonicBlockShift, IndexOutput outputMeta, IndexOutput vectorData, int count, int maxDoc, DocsWithFieldSet docsWithField) | Writes out the docsWithField and ordToDoc mapping to the outputMeta and vectorData respectively.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.lucene.util.Accountable
getChildResources
-
Method Details
-
writeStoredMeta
public static void writeStoredMeta(int directMonotonicBlockShift, IndexOutput outputMeta, IndexOutput vectorData, int count, int maxDoc, DocsWithFieldSet docsWithField) throws IOException
Writes out the docsWithField and ordToDoc mapping to the outputMeta and vectorData respectively, in adherence to the Lucene95HnswVectorsFormat.
Within outputMeta the format is as follows:
- [int8] If equal to -2, empty: no vector values. If equal to -1, dense: all documents have values for the field. If equal to 0, sparse: some documents are missing values.
- DocIds were encoded by IndexedDISI.writeBitSet(DocIdSetIterator, IndexOutput, byte)
- OrdToDoc was encoded by DirectMonotonicWriter (sparse case only)
Within vectorData the format is as follows:
- DocIds encoded by IndexedDISI.writeBitSet(DocIdSetIterator, IndexOutput, byte) (sparse case only)
- OrdToDoc encoded by DirectMonotonicWriter (sparse case only)
- Parameters:
outputMeta
- the outputMeta
vectorData
- the vectorData
count
- the count of docs with vectors
maxDoc
- the maxDoc for the index
docsWithField
- the docs containing a vector field
- Throws:
IOException
- thrown when writing data fails to either output
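The three-way marker described for outputMeta can be illustrated with a small sketch that does not call Lucene. The class `DocsWithFieldMarker` and its method are hypothetical names. The rule that `count == 0` means empty and `count == maxDoc` means dense is an assumption inferred from the parameter descriptions, not something this page states.

```java
/** Hypothetical helper, not part of Lucene: picks the marker value described for outputMeta. */
final class DocsWithFieldMarker {
  static final long EMPTY = -2;  // no vector values
  static final long DENSE = -1;  // every document has a value for the field
  static final long SPARSE = 0;  // some documents are missing values

  private DocsWithFieldMarker() {}

  /**
   * Chooses a marker from the number of docs with vectors and the index maxDoc.
   * Assumption: count == 0 is the empty case and count == maxDoc is the dense case.
   */
  static long markerFor(int count, int maxDoc) {
    if (count < 0 || maxDoc < 0 || count > maxDoc) {
      throw new IllegalArgumentException("expected 0 <= count <= maxDoc");
    }
    if (count == 0) {
      return EMPTY;
    }
    if (count == maxDoc) {
      return DENSE;
    }
    return SPARSE;
  }
}
```

Under these assumptions, only the sparse result would require the DocIds and OrdToDoc structures listed above to be written.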
-
fromStoredMeta
public static OrdToDocDISIReaderConfiguration fromStoredMeta(IndexInput inputMeta, int size) throws IOException
Reads in the necessary fields stored in the outputMeta to configure DirectMonotonicReader and IndexedDISI.
- Parameters:
inputMeta
- the inputMeta, previously written to via writeStoredMeta(int, IndexOutput, IndexOutput, int, int, DocsWithFieldSet)
size
- the number of vectors
- Returns:
- the configuration required to read sparse vectors
- Throws:
IOException
- thrown when reading data fails
-
ramBytesUsed
public long ramBytesUsed()
Description copied from interface: Accountable
Return the memory usage of this object in bytes. Negative values are illegal.
- Specified by:
ramBytesUsed in interface Accountable
-
getIndexedDISI
public IndexedDISI getIndexedDISI(IndexInput dataIn) throws IOException
- Parameters:
dataIn
- the dataIn
- Returns:
- the IndexedDISI for sparse values
- Throws:
IOException
- thrown when reading data fails
-
getDirectMonotonicReader
public DirectMonotonicReader getDirectMonotonicReader(IndexInput dataIn) throws IOException
- Parameters:
dataIn
- the dataIn
- Returns:
- the DirectMonotonicReader for sparse values
- Throws:
IOException
- thrown when reading data fails
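To make the ordinal-to-document mapping concrete, here is a minimal sketch of what the sparse case needs: a monotonically increasing table from vector ordinal to doc ID. A plain `long[]` stands in for the DirectMonotonicReader, and a `java.util.BitSet` stands in for the set of docs with a vector. `OrdToDocSketch` and its methods are hypothetical and do not use Lucene. Treating the ordinal as the doc ID in the dense case is an assumption drawn from the "all documents have values" description.

```java
import java.util.BitSet;

/** Hypothetical illustration of an ord -> doc mapping; not the Lucene implementation. */
final class OrdToDocSketch {
  private OrdToDocSketch() {}

  /** Builds the ord -> doc table: entry i is the doc ID of the i-th doc that has a vector. */
  static long[] buildOrdToDoc(BitSet docsWithField) {
    long[] ordToDoc = new long[docsWithField.cardinality()];
    int ord = 0;
    // Set bits are visited in ascending order, so the table is monotonically increasing.
    for (int doc = docsWithField.nextSetBit(0); doc >= 0; doc = docsWithField.nextSetBit(doc + 1)) {
      ordToDoc[ord++] = doc;
    }
    return ordToDoc;
  }

  /**
   * Resolves a vector ordinal to a doc ID.
   * Assumption: in the dense case the ordinal equals the doc ID, so no table is consulted.
   */
  static int ordToDoc(boolean dense, long[] ordToDoc, int ord) {
    return dense ? ord : (int) ordToDoc[ord];
  }
}
```

Because the table only ever increases, it is the kind of sequence a monotonic encoder can store compactly, which is why the mapping is only needed when the field is sparse.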
-
isEmpty
public boolean isEmpty()
- Returns:
- If true, the field is empty, with no vector values. If false, the field is either dense or sparse.
-
isDense
public boolean isDense()
- Returns:
- If true, the field is dense: all documents have values for the field. If false, the field is sparse: some documents are missing values.
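A caller that needs to tell the three layouts apart can combine the two accessors. The sketch below uses a hypothetical `FieldLayoutSketch` class that takes the two boolean results directly rather than a Lucene object. It assumes an empty field also reports `isDense() == false`, so it checks the empty flag first. That ordering is a defensive assumption, not something this page confirms.

```java
/** Hypothetical helper, not part of Lucene: turns the two flags into one of three layouts. */
final class FieldLayoutSketch {
  enum Layout { EMPTY, DENSE, SPARSE }

  private FieldLayoutSketch() {}

  /**
   * Classifies a field from the results of isEmpty() and isDense().
   * The empty flag is checked first so that an empty field is never reported as sparse.
   */
  static Layout classify(boolean isEmpty, boolean isDense) {
    if (isEmpty) {
      return Layout.EMPTY;
    }
    if (isDense) {
      return Layout.DENSE;
    }
    return Layout.SPARSE;
  }
}
```

With a real configuration object the call would look like `classify(config.isEmpty(), config.isDense())`, where `config` is a placeholder name for an instance of this class.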
-