public final class Lucene45DocValuesFormat extends DocValuesFormat
Encodes the four per-document value types (Numeric,Binary,Sorted,SortedSet) with these strategies:
BlockPackedWriter.
SmallFloat),
a lookup table is written instead. Each per-document entry is instead the ordinal
to this table, and those ordinals are compressed with bitpacking (PackedInts).
docID * length).
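The table-compressed strategy described above (a lookup table plus bit-packed ordinals) can be illustrated with a minimal sketch. The class and method names here are hypothetical and are not part of Lucene. The sketch also sorts the table so that ordinals can be found by binary search; the text does not say that the real table is ordered, so treat that as an assumption of this example.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical illustration of table compression; not Lucene's implementation.
final class TableCompressionSketch {

    /** Collects the unique values; a value's index in the result is its ordinal. */
    static long[] buildTable(long[] values) {
        TreeSet<Long> unique = new TreeSet<>();
        for (long v : values) {
            unique.add(v);
        }
        long[] table = new long[unique.size()];
        int i = 0;
        for (long v : unique) {
            table[i++] = v;
        }
        return table;
    }

    /** Replaces each per-document value with its ordinal in the table. */
    static int[] toOrdinals(long[] values, long[] table) {
        int[] ords = new int[values.length];
        for (int doc = 0; doc < values.length; doc++) {
            // Assumption: the table is sorted, so binary search finds the ordinal.
            ords[doc] = Arrays.binarySearch(table, values[doc]);
        }
        return ords;
    }

    /** Bits needed to bit-pack ordinals in the range [0, maxOrd]; at least 1. */
    static int bitsRequired(int maxOrd) {
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(maxOrd));
    }
}
```

With three unique values, every document needs only 2 bits for its ordinal regardless of how large the underlying values are, which is why the strategy suits small value sets with gaps in their range.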
Files:
The DocValues metadata or .dvm file.
For each DocValues field, this stores metadata, such as the offset into the DocValues data (.dvd) file.
DocValues metadata (.dvm) --> Header, <Entry>^NumFields
Sorted fields have two entries: a BinaryEntry with the value metadata, and an ordinary NumericEntry for the document-to-ord metadata.
SortedSet fields have three entries: a BinaryEntry with the value metadata, and two NumericEntries for the document-to-ord-index and ordinal list metadata.
FieldNumber of -1 indicates the end of metadata.
EntryType is either 0 (NumericEntry) or 1 (BinaryEntry).
DataOffset is the pointer to the start of the data in the DocValues data (.dvd)
NumericType indicates how Numeric values will be compressed:
BinaryType indicates how Binary values will be stored:
MinLength and MaxLength represent the min and max byte[] value lengths for Binary values. If they are equal, then all values are of a fixed size, and can be addressed as DataOffset + (docID * length). Otherwise, the binary values are of variable size, and packed integer metadata (PackedVersion,BlockSize) is written for the addresses.
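The fixed-size case can be illustrated with a short sketch of the address arithmetic. The class and method names are hypothetical, and an in-memory byte array stands in for the .dvd file; this is not Lucene's reader code.

```java
import java.util.Arrays;

// Hypothetical illustration of fixed-width Binary addressing.
final class FixedWidthBinarySketch {

    /** Absolute position of a document's value: DataOffset + (docID * length). */
    static long valueOffset(long dataOffset, int docID, int length) {
        // Widen to long before multiplying so large docIDs do not overflow int.
        return dataOffset + (long) docID * length;
    }

    /** Copies one fixed-length value out of an in-memory stand-in for the data file. */
    static byte[] read(byte[] data, long dataOffset, int docID, int length) {
        int start = (int) valueOffset(dataOffset, docID, length);
        return Arrays.copyOfRange(data, start, start + length);
    }
}
```

Because every value has the same length, no per-document address table is needed; that table is only written in the variable-size case.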
MissingOffset points to a byte[] containing a bitset of all documents that had a value for the field. If it is -1, then there are no missing values.
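A reader could consult such a bitset roughly as follows. This is a sketch with hypothetical names; the bit order within each byte (lowest bit first) is an assumption of the example, since the text above does not specify it.

```java
// Hypothetical illustration of checking the "has value" bitset.
final class MissingBitsetSketch {

    /**
     * Returns true if the document has a value for the field.
     * A missingOffset of -1 means no document is missing a value.
     */
    static boolean hasValue(byte[] bitset, long missingOffset, int docID) {
        if (missingOffset == -1) {
            return true;
        }
        // Assumption: bit (docID % 8) of byte (docID / 8), lowest bit first.
        return (bitset[docID >> 3] & (1 << (docID & 7))) != 0;
    }
}
```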
The DocValues data or .dvd file.
For each DocValues field, this stores the actual per-document data (the heavy lifting).
DocValues data (.dvd) --> Header, <NumericData | BinaryData | SortedData>^NumFields
Byte^DataLength,Addresses; FST<Int64>; BlockPackedInts(blockSize=16k); PackedInts; BlockPackedInts(blockSize=16k); MonotonicBlockPackedInts(blockSize=16k)
SortedSet entries store the list of ordinals in their BinaryData as a sequence of increasing vLongs, delta-encoded.
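The delta-encoded vLong layout for a document's ordinals can be sketched as below. The names are hypothetical, and the variable-length format used (7 data bits per byte, low-order group first, high bit set on every byte except the last) is the general Lucene vLong convention, reimplemented here by hand rather than taken from the library.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of delta-encoded vLongs; not Lucene's implementation.
final class OrdinalDeltaSketch {

    /** Writes v using 7 bits per byte; the high bit marks "more bytes follow". */
    static void writeVLong(ByteArrayOutputStream out, long v) {
        while ((v & ~0x7FL) != 0) {
            out.write((int) ((v & 0x7F) | 0x80));
            v >>>= 7;
        }
        out.write((int) v);
    }

    /** Encodes increasing ordinals as the gaps between consecutive values. */
    static byte[] encode(long[] sortedOrds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long previous = 0;
        for (long ord : sortedOrds) {
            writeVLong(out, ord - previous);
            previous = ord;
        }
        return out.toByteArray();
    }

    /** Reverses encode(): reads each vLong gap and accumulates it. */
    static long[] decode(byte[] bytes) {
        List<Long> ords = new ArrayList<>();
        long previous = 0;
        int pos = 0;
        while (pos < bytes.length) {
            long delta = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                delta |= (long) (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            previous += delta;
            ords.add(previous);
        }
        long[] result = new long[ords.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = ords.get(i);
        }
        return result;
    }
}
```

Since the ordinals are increasing, the gaps are small non-negative numbers, and most of them fit in a single byte.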
| Constructor and Description |
|---|
| Lucene45DocValuesFormat() Sole Constructor |

| Modifier and Type | Method and Description |
|---|---|
| DocValuesConsumer | fieldsConsumer(SegmentWriteState state) Returns a DocValuesConsumer to write docvalues to the index. |
| DocValuesProducer | fieldsProducer(SegmentReadState state) Returns a DocValuesProducer to read docvalues from the index. |
availableDocValuesFormats, forName, getName, reloadDocValuesFormats, toString

public DocValuesConsumer fieldsConsumer(SegmentWriteState state) throws IOException
Description copied from class: DocValuesFormat
Returns a DocValuesConsumer to write docvalues to the index.
Specified by: fieldsConsumer in class DocValuesFormat
Throws: IOException

public DocValuesProducer fieldsProducer(SegmentReadState state) throws IOException
Description copied from class: DocValuesFormat
Returns a DocValuesProducer to read docvalues from the index.
NOTE: by the time this call returns, it must hold open any files it will need to use; otherwise, those files may be deleted. Additionally, required files may be deleted during the execution of this call before there is a chance to open them. Under these circumstances an IOException should be thrown by the implementation. IOExceptions are expected and will automatically cause a retry of the segment opening logic with the newly revised segments.
Specified by: fieldsProducer in class DocValuesFormat
Throws: IOException

Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.