All Classes and Interfaces
Class
Description
Decodes the raw bytes of a block when the index is read, according to the BlockEncoder used during the writing of the index.
Encodes the raw bytes of a block when the index is written.
Writable byte buffer.
Block header containing block metadata.
Reads/writes block header.
One term block line.
Reads/writes block lines with terms encoded incrementally inside a block.
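The incremental term encoding mentioned above can be illustrated with a minimal sketch. It assumes "incremental" means each term stores only the length of the prefix it shares with the previous term plus the remaining suffix. The class `IncrementalTerms` and its `Line` type are hypothetical names for illustration, not part of the Lucene API, and the sketch works on `String` rather than raw bytes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of prefix-shared ("incremental") term encoding.
final class IncrementalTerms {

    /** One encoded line: shared-prefix length with the previous term, plus the rest. */
    static final class Line {
        final int prefixLen;
        final String suffix;

        Line(int prefixLen, String suffix) {
            this.prefixLen = prefixLen;
            this.suffix = suffix;
        }

        @Override
        public String toString() {
            return prefixLen + "+" + suffix;
        }
    }

    /** Encodes sorted terms; the first term always has prefixLen 0. */
    static List<Line> encode(List<String> sortedTerms) {
        List<Line> lines = new ArrayList<>();
        String prev = "";
        for (String term : sortedTerms) {
            int max = Math.min(prev.length(), term.length());
            int shared = 0;
            while (shared < max && prev.charAt(shared) == term.charAt(shared)) {
                shared++;
            }
            lines.add(new Line(shared, term.substring(shared)));
            prev = term;
        }
        return lines;
    }

    /** Rebuilds the full terms by replaying each line against the previous term. */
    static List<String> decode(List<Line> lines) {
        List<String> terms = new ArrayList<>();
        String prev = "";
        for (Line line : lines) {
            String term = prev.substring(0, line.prefixLen) + line.suffix;
            terms.add(term);
            prev = term;
        }
        return terms;
    }

    public static void main(String[] args) {
        List<Line> lines = encode(List.of("apple", "applet", "apply", "banana"));
        System.out.println(lines);
        System.out.println(decode(lines));
    }
}
```

Because each line depends on the previous one, a reader of such a block has to scan from the start of the block, which is one reason blocks are kept small.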
Seeks the block corresponding to a given term, reads the block bytes, and scans the block terms.
Handles a terms dict, but decouples all details of doc/freqs/positions reading to an instance of PostingsReaderBase.
Writes terms dict, block-encoding (column stride) each term's metadata for each set of terms between two index terms.
Uses OrdsBlockTreeTermsWriter with Lucene101PostingsWriter.
Writes blocks in the block file.
Class used to create index-time FuzzySet appropriately configured for each field.
A PostingsFormat useful for low doc-frequency fields such as primary keys.
Default policy is to allocate a bitset with 10% saturation given a unique term per document.
TermState serializer which encodes each file pointer as a delta relative to a base file pointer.
Wraps Lucene101PostingsFormat format for on-disk storage, but then at read time loads and stores all terms and postings directly in RAM as byte[], int[].
Metadata and stats for one field in the index.
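The idea behind the TermState serializer described above, storing each file pointer as a delta relative to a base file pointer, can be sketched as follows. `DeltaPointers` is a hypothetical helper, not the Lucene class. The requirement that every pointer be at or after the base is an assumption made here so that deltas stay non-negative.

```java
// Hypothetical illustration of base-relative delta encoding of file pointers.
final class DeltaPointers {

    /** Converts absolute file pointers to offsets from a base pointer. */
    static long[] encode(long base, long[] pointers) {
        long[] deltas = new long[pointers.length];
        for (int i = 0; i < pointers.length; i++) {
            if (pointers[i] < base) {
                // Assumption of this sketch: no pointer precedes the base.
                throw new IllegalArgumentException("pointer before base: " + pointers[i]);
            }
            deltas[i] = pointers[i] - base;
        }
        return deltas;
    }

    /** Restores absolute file pointers from base-relative deltas. */
    static long[] decode(long base, long[] deltas) {
        long[] pointers = new long[deltas.length];
        for (int i = 0; i < deltas.length; i++) {
            pointers[i] = base + deltas[i];
        }
        return pointers;
    }

    public static void main(String[] args) {
        long[] deltas = encode(1000L, new long[] {1000L, 1024L, 1100L});
        System.out.println(java.util.Arrays.toString(deltas));
    }
}
```

Small deltas are cheaper to store than full 64-bit pointers when they are written with a variable-length integer encoding.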
Reads/writes field metadata.
Pair of FieldMetadata and BlockTermState for a specific field.
TermsIndexReader for simple every Nth terms indexes.
Selects every Nth term as an index term, and holds term bytes (mostly) fully expanded in memory.
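A minimal sketch of an "every Nth term" index, as described above, might look like this. `EveryNthTermIndex` and `seekStart` are hypothetical names, and terms are plain `String` values held fully in memory. The real reader and writer work on term bytes and file offsets.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of a fixed-gap terms index.
final class EveryNthTermIndex {
    private final List<String> indexTerms = new ArrayList<>();
    private final int interval;

    /** Keeps every interval-th term of the sorted term list as an index term. */
    EveryNthTermIndex(List<String> sortedTerms, int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        this.interval = interval;
        for (int i = 0; i < sortedTerms.size(); i += interval) {
            indexTerms.add(sortedTerms.get(i));
        }
    }

    /**
     * Returns the position in the full term list where a scan for target
     * should start, or -1 if target sorts before every indexed term.
     */
    int seekStart(String target) {
        int pos = Collections.binarySearch(indexTerms, target);
        if (pos < 0) {
            pos = -pos - 2; // last index term that sorts before target
        }
        return pos < 0 ? -1 : pos * interval;
    }

    public static void main(String[] args) {
        EveryNthTermIndex index =
            new EveryNthTermIndex(List.of("a", "b", "c", "d", "e", "f", "g"), 3);
        System.out.println(index.seekStart("e"));
    }
}
```

A lookup then needs one binary search over the in-memory index terms plus a scan of at most `interval` terms in the main dictionary.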
A bit vector scorer for scoring byte vectors.
Immutable stateless FST-based index dictionary kept in memory.
Provides stateful FSTDictionary.Browser to seek in the FSTDictionary.
Builds an immutable FSTDictionary.
FST term dict + Lucene50PBF
FST-based terms dictionary reader.
FST-based term dict, using metadata as FST output.
A class used to represent a set of many, potentially large, values (e.g.
Result from FuzzySet.contains(BytesRef): can never return definitively YES (always MAYBE), but can sometimes definitely return NO.
Base class for hashing functions that can be referred to by name.
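The MAYBE/NO contract of the FuzzySet.contains(BytesRef) result described above can be illustrated with a toy bit-set membership filter. `TinyFuzzySet` is a hypothetical class that uses a single `String.hashCode()`-based slot per value. It is not the Lucene implementation, whose hashing and sizing differ.

```java
import java.util.BitSet;

// Hypothetical illustration of a probabilistic set with MAYBE/NO answers.
final class TinyFuzzySet {

    enum Result { MAYBE, NO }

    private final BitSet bits;
    private final int size;

    TinyFuzzySet(int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be >= 1");
        }
        this.size = size;
        this.bits = new BitSet(size);
    }

    /** Maps a value to one bit position; different values may collide. */
    private int slot(String value) {
        return Math.floorMod(value.hashCode(), size);
    }

    void add(String value) {
        bits.set(slot(value));
    }

    /**
     * An unset bit proves the value was never added (NO). A set bit may come
     * from this value or from a colliding one, so the answer is only MAYBE.
     */
    Result contains(String value) {
        return bits.get(slot(value)) ? Result.MAYBE : Result.NO;
    }

    public static void main(String[] args) {
        TinyFuzzySet set = new TinyFuzzySet(1024);
        System.out.println(set.contains("doc-1"));
        set.add("doc-1");
        System.out.println(set.contains("doc-1"));
    }
}
```

A definite NO lets a reader skip a segment's terms dictionary entirely, which is why such a filter suits low doc-frequency fields like primary keys.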
Encodes bit vector values into an associated graph connecting the documents having values.
Immutable stateless index dictionary kept in RAM.
Stateful IndexDictionary.Browser to seek a term in this IndexDictionary and get its corresponding block file pointer in the block file.
Supplier for a new stateful IndexDictionary.Browser created on the immutable IndexDictionary.
Builds an immutable IndexDictionary.
The "intersect" TermsEnum response to UniformSplitTerms.intersect(CompiledAutomaton, BytesRef), intersecting the terms with an automaton.
Block iteration order.
This is a very fast, non-cryptographic hash suitable for general hash-based lookup.
This is just like Lucene90BlockTreeTermsWriter, except it also stores a version per term, and adds a method to its TermsEnum implementation to seekExact only if the version is >= the specified version.
Iterates through terms in this field.
Utility methods to estimate the RAM usage of objects.
Plain text index format.
Plain text compound format.
Plain text field infos format.
For debugging, curiosity, transparency only!! Do not use this codec in production.
Reads vector values from a simple text format.
Writes vector-valued fields in a plain text format.
Reads/writes plain text live docs.
Plain text norms format.
Writes plain text norms.
Reads plain text norms.
For debugging, curiosity, transparency only!! Do not use this codec in production.
Plain text segments file format.
Plain text stored fields format.
Reads plain text stored fields.
Writes plain text stored fields.
Plain text term vectors format.
Reads plain text term vectors.
Writes plain text term vectors.
Represents a term and its details stored in the BlockTermState.
Reads block lines encoded incrementally, with all fields corresponding to the term of the line.
Reads terms blocks with the Shared Terms format.
Writes terms blocks with the Shared Terms format.
The "intersect" TermsEnum response to STUniformSplitTerms.intersect(CompiledAutomaton, BytesRef), intersecting the terms with an automaton.
PostingsFormat based on the Uniform Split technique and supporting Shared Terms.
Extends UniformSplitTerms for a shared-terms dictionary, with all the fields of a term in the same block line.
A block-based terms index and dictionary based on the Uniform Split technique, and sharing all the fields' terms in the same dictionary, with all the fields of a term in the same block line.
Extends UniformSplitTermsWriter by sharing all the fields' terms in the same dictionary and by writing all the fields of a term in the same block line.
Term of a block line.
BlockTermsReader interacts with an instance of this class to manage its terms index.
Similar to TermsEnum, except that the only "metadata" it reports for a given indexed term is the long fileOffset into the main terms dictionary file.
Base class for terms index implementations to plug into BlockTermsWriter.
PostingsFormat based on the Uniform Split technique.
Terms based on the Uniform Split technique.
A block-based terms index and dictionary based on the Uniform Split technique.
A block-based terms index and dictionary that assigns terms to nearly uniform length blocks.
Builds a FieldMetadata that is the union of multiple FieldMetadata.
Selects index terms according to provided pluggable VariableGapTermsIndexWriter.IndexTermSelector, and stores them in a prefix trie that's loaded entirely in RAM stored as an FST.
Sets an index term when docFreq >= docFreqThresh, or every interval terms.
Same policy as FixedGapTermsIndexWriter.
Hook for selecting which terms should be placed in the terms index.
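The selection policy listed above, an index term when docFreq >= docFreqThresh or every interval terms, can be sketched as a small stateful selector. `DocFreqOrIntervalSelector` is a hypothetical class. In particular, the choices to always index the first term and to restart the interval count after any index term are assumptions of this sketch, not confirmed behavior of the Lucene selector.

```java
// Hypothetical illustration of a "docFreq threshold or every Nth term" selector.
final class DocFreqOrIntervalSelector {
    private final int docFreqThresh;
    private final int interval;
    private int sinceLast;

    DocFreqOrIntervalSelector(int docFreqThresh, int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        this.docFreqThresh = docFreqThresh;
        this.interval = interval;
        // Assumption: start "full" so the very first term becomes an index term.
        this.sinceLast = interval;
    }

    /** Decides whether the next term, seen in sorted order, becomes an index term. */
    boolean isIndexTerm(int docFreq) {
        if (docFreq >= docFreqThresh || sinceLast >= interval) {
            sinceLast = 1; // assumption: any index term restarts the interval count
            return true;
        }
        sinceLast++;
        return false;
    }

    public static void main(String[] args) {
        DocFreqOrIntervalSelector selector = new DocFreqOrIntervalSelector(100, 3);
        int[] docFreqs = {1, 1, 1, 1, 500, 1};
        for (int docFreq : docFreqs) {
            System.out.println(docFreq + " -> " + selector.isIndexTerm(docFreq));
        }
    }
}
```

Indexing high doc-frequency terms directly means the most commonly looked-up terms can be resolved from the in-memory index without scanning a run of the terms dictionary.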