Package org.apache.lucene.codecs.uniformsplit.sharedterms
Pluggable term index / block terms dictionary implementations.
Extension of org.apache.lucene.codecs.uniformsplit with the Shared Terms principle: terms are shared between all fields.
It is particularly suited to indexing a massive number of fields because all the terms are stored in a single FST dictionary.
- Designed to be extensible
- Highly reduced on-heap memory usage when dealing with a massive number of fields.
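The Shared Terms principle described above can be illustrated with a minimal, self-contained sketch. It is not Lucene code: `SharedTermsSketch`, `add`, `docFreq`, and `termCount` are hypothetical names introduced only for illustration, and a `TreeMap` stands in for the FST dictionary. The sketch assumes only what the description states: each term is stored once for all fields, and the per-field details of a term sit together in a single entry.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of the Shared Terms idea; not a Lucene class. */
final class SharedTermsSketch {
  // One sorted dictionary for every field: term -> (field name -> document frequency).
  // A TreeMap stands in for the single FST dictionary (assumption for illustration).
  private final TreeMap<String, Map<String, Integer>> dictionary = new TreeMap<>();

  /** Records that the term occurs in the field with the given document frequency. */
  void add(String field, String term, int docFreq) {
    dictionary.computeIfAbsent(term, t -> new TreeMap<>()).put(field, docFreq);
  }

  /** Returns the document frequency of the term in the field, or -1 if absent. */
  int docFreq(String field, String term) {
    Map<String, Integer> line = dictionary.get(term);
    if (line == null) {
      return -1;
    }
    Integer df = line.get(field);
    return df == null ? -1 : df;
  }

  /** Number of distinct terms stored, regardless of how many fields use them. */
  int termCount() {
    return dictionary.size();
  }

  public static void main(String[] args) {
    SharedTermsSketch sketch = new SharedTermsSketch();
    sketch.add("title", "lucene", 3);
    sketch.add("body", "lucene", 12);
    sketch.add("body", "codec", 5);
    System.out.println(sketch.termCount());               // prints 2
    System.out.println(sketch.docFreq("body", "lucene")); // prints 12
    System.out.println(sketch.docFreq("title", "codec")); // prints -1
  }
}
```

In this sketch the term "lucene" is stored once even though two fields use it, so the number of dictionary entries grows with the number of distinct terms rather than with the number of fields.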
Class descriptions
- Pair of FieldMetadata and BlockTermState for a specific field.
- Represents a term and its details stored in the BlockTermState.
- Reads block lines encoded incrementally, with all fields corresponding to the term of the line.
- Reads terms blocks with the Shared Terms format.
- Writes terms blocks with the Shared Terms format.
- The "intersect" TermsEnum response to STUniformSplitTerms.intersect(CompiledAutomaton, BytesRef), intersecting the terms with an automaton.
- Combines PostingsEnum for the same term for a given field from multiple segments.
- PostingsFormat based on the Uniform Split technique and supporting Shared Terms.
- Extends UniformSplitTerms for a shared-terms dictionary, with all the fields of a term in the same block line.
- A block-based terms index and dictionary based on the Uniform Split technique, sharing all the fields' terms in the same dictionary, with all the fields of a term in the same block line.
- Extends UniformSplitTermsWriter by sharing all the fields' terms in the same dictionary and by writing all the fields of a term in the same block line.
- Builds a FieldMetadata that is the union of multiple FieldMetadata.