org.apache.lucene.sandbox.facet.plain.histograms.HistogramCollectorManager

All Implemented Interfaces: CollectorManager<org.apache.lucene.sandbox.facet.plain.histograms.HistogramCollector,LongIntHashMap>

public final class HistogramCollectorManager extends Object implements CollectorManager<org.apache.lucene.sandbox.facet.plain.histograms.HistogramCollector,LongIntHashMap>

CollectorManager that computes a histogram of the distribution of the values of a field.

It takes a bucketWidth as a parameter and counts the number of documents whose values fall into the intervals [0, bucketWidth), [bucketWidth, 2*bucketWidth), etc. The keys of the returned LongIntHashMap identify these intervals as the quotient of the integer division of the value by bucketWidth. In other words, a key equal to k maps to values in the interval [k * bucketWidth, (k+1) * bucketWidth).
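To make the key arithmetic concrete, here is a minimal sketch in plain Java with no Lucene dependency. The class `BucketKeySketch` and its methods are hypothetical names used only for illustration, and `java.util.TreeMap` stands in for `LongIntHashMap`. It assumes floor division (`Math.floorDiv`) so that the interval property stated above also holds for negative values; that rounding choice is an assumption of this sketch, not a statement about the collector's internals.

```java
import java.util.Map;
import java.util.TreeMap;

class BucketKeySketch {

    // Returns the key k such that k * bucketWidth <= value < (k + 1) * bucketWidth.
    // Assumption: floor division, so negative values also satisfy the interval property.
    static long bucketKey(long value, long bucketWidth) {
        return Math.floorDiv(value, bucketWidth);
    }

    // Counts how many values fall into each bucket (stand-in for the collected histogram).
    static Map<Long, Integer> histogram(long[] values, long bucketWidth) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (long value : values) {
            counts.merge(bucketKey(value, bucketWidth), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] values = {0, 3, 9, 10, 25, 29, -1};
        // With bucketWidth = 10: key 0 covers [0, 10), key 1 covers [10, 20),
        // key 2 covers [20, 30) and key -1 covers [-10, 0).
        System.out.println(histogram(values, 10)); // prints {-1=1, 0=3, 1=1, 2=2}
    }
}
```

To recover the interval for a key k, multiply back: the lower bound is k * bucketWidth and the exclusive upper bound is (k+1) * bucketWidth.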

This implementation is optimized for the case when the field is part of the index sort and has a skip index.

Note: this collector is inspired by "YU, Muzhi, LIN, Zhaoxiang, SUN, Jinan, et al. TencentCLS: the cloud log service with high query performances. Proceedings of the VLDB Endowment, 2022, vol. 15, no 12, p. 3472-3482.", where the authors describe how they run "histogram queries" by sorting the index by timestamp and pre-computing ranges of doc IDs for every possible bucket.

Constructor Summary

Constructors

Constructor

Description

HistogramCollectorManager(String field, long bucketWidth)

Compute a histogram of the distribution of the values of the given field according to the given bucketWidth.

HistogramCollectorManager(String field, long bucketWidth, int maxBuckets)

Expert constructor.

Method Summary

Modifier and Type

Method

Description

org.apache.lucene.sandbox.facet.plain.histograms.HistogramCollector

newCollector()

LongIntHashMap

reduce(Collection<org.apache.lucene.sandbox.facet.plain.histograms.HistogramCollector> collectors)

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- HistogramCollectorManager
  
  public HistogramCollectorManager(String field, long bucketWidth)
  
  Compute a histogram of the distribution of the values of the given field according to the given bucketWidth. This configures a maximum number of buckets equal to the default of 1024.
- HistogramCollectorManager
  
  public HistogramCollectorManager(String field, long bucketWidth, int maxBuckets)
  
  Expert constructor.
  
  Parameters:
  
  maxBuckets - Max allowed number of buckets. Note that this is checked at runtime and on a best-effort basis.

Method Details
- newCollector
  
  public org.apache.lucene.sandbox.facet.plain.histograms.HistogramCollector newCollector() throws IOException
  
  Specified by:
  
  newCollector in interface CollectorManager<org.apache.lucene.sandbox.facet.plain.histograms.HistogramCollector,LongIntHashMap>
  
  Throws:
  
  IOException
- reduce
  
  public LongIntHashMap reduce(Collection<org.apache.lucene.sandbox.facet.plain.histograms.HistogramCollector> collectors) throws IOException
  
  Specified by:
  
  reduce in interface CollectorManager<org.apache.lucene.sandbox.facet.plain.histograms.HistogramCollector,LongIntHashMap>
  
  Throws:
  
  IOException
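As a rough illustration of what a reduce step over per-collector histograms involves, the sketch below merges several partial count maps by summing the counts that share a bucket key. It is plain Java with no Lucene dependency: `ReduceSketch` and `mergeCounts` are hypothetical names, `java.util.Map` stands in for `LongIntHashMap`, and the `maxBuckets` check that throws `IllegalStateException` is an assumed behavior, not the documented one.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ReduceSketch {

    // Merges partial histograms by summing counts that share a bucket key.
    static Map<Long, Integer> mergeCounts(List<Map<Long, Integer>> partials, int maxBuckets) {
        Map<Long, Integer> merged = new TreeMap<>();
        for (Map<Long, Integer> partial : partials) {
            for (Map.Entry<Long, Integer> entry : partial.entrySet()) {
                merged.merge(entry.getKey(), entry.getValue(), Integer::sum);
                // Assumed limit check; the real exception type and timing may differ.
                if (merged.size() > maxBuckets) {
                    throw new IllegalStateException("more than " + maxBuckets + " buckets");
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Two partial histograms, e.g. one per search thread.
        Map<Long, Integer> first = Map.of(0L, 3, 1L, 1);
        Map<Long, Integer> second = Map.of(1L, 4, 2L, 2);
        System.out.println(mergeCounts(List.of(first, second), 1024)); // prints {0=3, 1=5, 2=2}
    }
}
```

Because the merge only adds counts per key, the result does not depend on the order in which the partial histograms are visited.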
