Class HistogramCollectorManager
All Implemented Interfaces:
CollectorManager<org.apache.lucene.sandbox.facet.plain.histograms.HistogramCollector, LongIntHashMap>
CollectorManager that computes a histogram of the distribution of the values of a field. It takes a bucketWidth as a parameter and counts the number of documents that fall into the intervals [0, bucketWidth), [bucketWidth, 2*bucketWidth), etc. The keys of the returned LongIntHashMap identify these intervals as the quotient of the integer division of the value by bucketWidth. In other words, a key equal to k maps to values in the interval [k * bucketWidth, (k+1) * bucketWidth).
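The key arithmetic described above can be illustrated with a small standalone sketch. It does not use Lucene: the class BucketKeys and its methods are hypothetical names introduced for illustration. The use of Math.floorDiv is an assumption, chosen so that negative values also satisfy the stated interval property; the description above does not say how negative values are handled.

```java
public class BucketKeys {
    /** Key of the bucket containing value: the floor of value / bucketWidth. */
    public static long bucketKey(long value, long bucketWidth) {
        if (bucketWidth <= 0) {
            throw new IllegalArgumentException("bucketWidth must be positive");
        }
        // Assumption: floor division, so that key k always covers
        // [k * bucketWidth, (k + 1) * bucketWidth), including for negative values.
        return Math.floorDiv(value, bucketWidth);
    }

    /** Inclusive lower bound of the interval identified by key. */
    public static long lowerBound(long key, long bucketWidth) {
        return Math.multiplyExact(key, bucketWidth);
    }

    /** Exclusive upper bound of the interval identified by key. */
    public static long upperBound(long key, long bucketWidth) {
        return Math.multiplyExact(Math.addExact(key, 1), bucketWidth);
    }

    public static void main(String[] args) {
        long bucketWidth = 10;
        for (long value : new long[] {0, 9, 10, 25}) {
            long key = bucketKey(value, bucketWidth);
            System.out.println(value + " -> key " + key + " covering ["
                + lowerBound(key, bucketWidth) + ", " + upperBound(key, bucketWidth) + ")");
        }
    }
}
```

With a bucketWidth of 10, the value 25 falls under key 2, which stands for the interval [20, 30).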
This implementation is optimized for the case when field is part of the index sort and has a skip index.
Note: this collector is inspired by "YU, Muzhi, LIN, Zhaoxiang, SUN, Jinan, et al. TencentCLS: the cloud log service with high query performances. Proceedings of the VLDB Endowment, 2022, vol. 15, no. 12, p. 3472-3482.", where the authors describe how they run "histogram queries" by sorting the index by timestamp and pre-computing ranges of doc IDs for every possible bucket.
-
Constructor Summary
Constructors
Constructor: HistogramCollectorManager(String field, long bucketWidth)
Description: Compute a histogram of the distribution of the values of the given field according to the given bucketWidth.
Constructor: HistogramCollectorManager(String field, long bucketWidth, int maxBuckets)
Description: Expert constructor.
-
Method Summary
Modifier and Type: org.apache.lucene.sandbox.facet.plain.histograms.HistogramCollector
Method: newCollector()
Modifier and Type: LongIntHashMap
Method: reduce(Collection<org.apache.lucene.sandbox.facet.plain.histograms.HistogramCollector> collectors)
-
Constructor Details
-
HistogramCollectorManager
Compute a histogram of the distribution of the values of the given field according to the given bucketWidth. This configures a maximum number of buckets equal to the default of 1024.
-
HistogramCollectorManager
Expert constructor.
Parameters:
maxBuckets - Maximum allowed number of buckets. Note that this is checked at runtime and on a best-effort basis.
-
-
Method Details
-
newCollector
public org.apache.lucene.sandbox.facet.plain.histograms.HistogramCollector newCollector() throws IOException
Specified by:
newCollector in interface CollectorManager<org.apache.lucene.sandbox.facet.plain.histograms.HistogramCollector, LongIntHashMap>
Throws:
IOException
-
reduce
public LongIntHashMap reduce(Collection<org.apache.lucene.sandbox.facet.plain.histograms.HistogramCollector> collectors) throws IOException
Specified by:
reduce in interface CollectorManager<org.apache.lucene.sandbox.facet.plain.histograms.HistogramCollector, LongIntHashMap>
Throws:
IOException
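This page does not describe how reduce combines the results of the individual collectors. As a rough sketch of what such a reduction could look like, the standalone example below sums per-bucket counts across several partial histograms. It is an assumption, not the Lucene implementation: the class HistogramMerge is a hypothetical name, and java.util.Map<Long, Integer> stands in for LongIntHashMap so that the example runs without Lucene on the classpath.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HistogramMerge {
    /**
     * Merges partial histograms (bucket key -> document count) by summing
     * the counts of identical keys. Assumed semantics, for illustration only.
     */
    public static Map<Long, Integer> merge(Collection<Map<Long, Integer>> partialHistograms) {
        Map<Long, Integer> merged = new HashMap<>();
        for (Map<Long, Integer> partial : partialHistograms) {
            for (Map.Entry<Long, Integer> entry : partial.entrySet()) {
                // Add this partial count to whatever has been accumulated for the key.
                merged.merge(entry.getKey(), entry.getValue(), Integer::sum);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Two hypothetical partial results, e.g. one per index slice.
        Map<Long, Integer> first = Map.of(0L, 3, 1L, 2);
        Map<Long, Integer> second = Map.of(1L, 4, 5L, 1);
        System.out.println(merge(List.of(first, second)));
    }
}
```

In this sketch, a bucket key that appears in only one partial histogram keeps its count unchanged, while a key that appears in several ends up with the sum of their counts.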
-