@Deprecated public class Lucene40DocValuesFormat extends DocValuesFormat
Files:
compound container
compound entries
There are several types of DocValues
with different encodings.
From the perspective of filenames, all types store their values in .dat
entries within the compound file. In the case of dereferenced/sorted types, the .dat
actually contains only the unique values, and an additional .idx file contains
pointers to these unique values.
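The split between unique values and per-document pointers can be illustrated with a small in-memory sketch. This is not Lucene code: the class `DerefSketch`, its fields, and the first-seen ordering of the unique values are assumptions made for illustration only. It shows the idea that the values are stored once (conceptually the .dat payload) while each document keeps only a pointer (conceptually the .idx payload).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of dereferenced storage; not part of Lucene. */
final class DerefSketch {
    /** Unique values in first-seen order (stands in for the .dat contents). */
    final List<String> uniqueValues = new ArrayList<>();
    /** One pointer per document into uniqueValues (stands in for the .idx contents). */
    final int[] pointers;

    DerefSketch(String[] perDocValues) {
        Map<String, Integer> slotByValue = new HashMap<>();
        pointers = new int[perDocValues.length];
        for (int docId = 0; docId < perDocValues.length; docId++) {
            String value = perDocValues[docId];
            Integer slot = slotByValue.get(value);
            if (slot == null) {
                // First time this value is seen: store it once and remember its slot.
                slot = uniqueValues.size();
                slotByValue.put(value, slot);
                uniqueValues.add(value);
            }
            pointers[docId] = slot;
        }
    }

    /** Resolves a document's value through its pointer. */
    String valueFor(int docId) {
        return uniqueValues.get(pointers[docId]);
    }
}
```

With five documents holding "red", "blue", "red", "green", "blue", only three values are stored, and the repeated documents share pointers to them.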
VAR_INTS .dat --> Header, PackedType, MinValue, DefaultValue, PackedStream
FIXED_INTS_8 .dat --> Header, ValueSize, Byte^maxdoc
FIXED_INTS_16 .dat --> Header, ValueSize, Short^maxdoc
FIXED_INTS_32 .dat --> Header, ValueSize, Int32^maxdoc
FIXED_INTS_64 .dat --> Header, ValueSize, Int64^maxdoc
FLOAT_32 .dat --> Header, ValueSize, Float32^maxdoc
FLOAT_64 .dat --> Header, ValueSize, Float64^maxdoc
BYTES_FIXED_STRAIGHT .dat --> Header, ValueSize, (Byte * ValueSize)^maxdoc
BYTES_VAR_STRAIGHT .idx --> Header, TotalBytes, Addresses
BYTES_VAR_STRAIGHT .dat --> Header, (Byte * variable ValueSize)^maxdoc
BYTES_FIXED_DEREF .idx --> Header, NumValues, Addresses
BYTES_FIXED_DEREF .dat --> Header, ValueSize, (Byte * ValueSize)^NumValues
BYTES_VAR_DEREF .idx --> Header, TotalVarBytes, Addresses
BYTES_VAR_DEREF .dat --> Header, (LengthPrefix + Byte * variable ValueSize)^NumValues
BYTES_FIXED_SORTED .idx --> Header, NumValues, Ordinals
BYTES_FIXED_SORTED .dat --> Header, ValueSize, (Byte * ValueSize)^NumValues
BYTES_VAR_SORTED .idx --> Header, TotalVarBytes, Addresses, Ordinals
BYTES_VAR_SORTED .dat --> Header, (Byte * variable ValueSize)^NumValues
Header --> CodecHeader
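PackedType --> Byte
MinValue, DefaultValue --> Int64
PackedStream, Addresses, Ordinals --> PackedInts
ValueSize, NumValues --> Int32
Float32 --> 32-bit float encoded with Float.floatToRawIntBits(float) then written as Int32
Float64 --> 64-bit float encoded with Double.doubleToRawLongBits(double) then written as Int64
TotalBytes --> VLong
TotalVarBytes --> Int64
LengthPrefix --> Length of the data value as VInt (maximum of 2 bytes)
For the deduplicated VAR_DEREF case, each length is encoded as a prefix to the data itself as a VInt (maximum of 2 bytes).
In the FIXED_SORTED case, the address into the .dat can be computed from the ordinal as Header+ValueSize+(ordinal*ValueSize)
because the byte length is fixed.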
In the VAR_SORTED case, there is double indirection (docid -> ordinal -> address), but
an additional sentinel ordinal+address is always written (so there are NumValues+1 ordinals). To
determine the length, ord+1's address is looked up as well.
BYTES_VAR_STRAIGHT, in contrast to other straight variants, uses a .idx file to improve lookup performance.
In contrast to BYTES_VAR_DEREF, it doesn't apply deduplication of the document values.
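The two sorted lookups described in the notes can be sketched as follows. This is a minimal illustration, not the Lucene implementation: the class and method names are hypothetical, plain arrays stand in for the packed Ordinals and Addresses streams, and the reading of `Header+ValueSize+(ordinal*ValueSize)` as "header length, plus the 4 bytes of the Int32 ValueSize field, plus the ordinal times the value size" is an assumption.

```java
/** Hypothetical illustration of sorted doc values lookups; not part of Lucene. */
final class SortedAddressSketch {

    /**
     * FIXED_SORTED: every value has the same length, so the offset into the
     * .dat follows directly from the ordinal.
     * Assumption: the ValueSize field occupies Integer.BYTES (4) bytes on disk.
     */
    static long fixedSortedOffset(long headerBytes, int valueSize, int ordinal) {
        return headerBytes + Integer.BYTES + (long) ordinal * valueSize;
    }

    /**
     * VAR_SORTED: double indirection (docid -> ordinal -> address).
     * The addresses array carries one extra sentinel entry, so the length of
     * a value is the difference between the addresses at ord+1 and ord.
     * Returns {start, length}.
     */
    static long[] varSortedRange(int[] ordinals, long[] addresses, int docId) {
        int ord = ordinals[docId];
        long start = addresses[ord];
        long end = addresses[ord + 1]; // sentinel guarantees ord + 1 is in range
        return new long[] {start, end - start};
    }
}
```

For example, with three values of lengths 3, 4, and 5, the addresses would be `{0, 3, 7, 12}`, where 12 is the sentinel. A document with ordinal 2 then resolves to start 7 and length 5 without storing any explicit length.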
Limitations:
Binary doc values can be at most MAX_BINARY_FIELD_LENGTH in length.
Modifier and Type | Field and Description |
---|---|
static int | MAX_BINARY_FIELD_LENGTH Deprecated. Maximum length for each binary doc values field. |
Constructor and Description |
---|
Lucene40DocValuesFormat() Deprecated. Sole constructor. |
Modifier and Type | Method and Description |
---|---|
DocValuesConsumer | fieldsConsumer(SegmentWriteState state) Deprecated. Returns a DocValuesConsumer to write docvalues to the index. |
DocValuesProducer | fieldsProducer(SegmentReadState state) Deprecated. Returns a DocValuesProducer to read docvalues from the index. |
Methods inherited from class DocValuesFormat: availableDocValuesFormats, forName, getName, reloadDocValuesFormats, toString
public static final int MAX_BINARY_FIELD_LENGTH
public Lucene40DocValuesFormat()
public DocValuesConsumer fieldsConsumer(SegmentWriteState state) throws IOException
Description copied from class: DocValuesFormat
Returns a DocValuesConsumer to write docvalues to the index.
Specified by: fieldsConsumer in class DocValuesFormat
Throws: IOException
public DocValuesProducer fieldsProducer(SegmentReadState state) throws IOException
Description copied from class: DocValuesFormat
Returns a DocValuesProducer to read docvalues from the index.
NOTE: by the time this call returns, it must hold open any files it will need to use; else, those files may be deleted. Additionally, required files may be deleted during the execution of this call before there is a chance to open them. Under these circumstances an IOException should be thrown by the implementation. IOExceptions are expected and will automatically cause a retry of the segment opening logic with the newly revised segments.
Specified by: fieldsProducer in class DocValuesFormat
Throws: IOException
Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.