Class RandomSamplingFacetsCollector

All Implemented Interfaces:
Collector, LeafCollector

public class RandomSamplingFacetsCollector extends FacetsCollector
Collects hits for subsequent faceting, using sampling if needed. Once you've run a search and collected hits into this, instantiate one of the Facets subclasses to do the facet counting. Note that this collector does not collect the scores of matching docs (i.e. FacetsCollector.MatchingDocs.scores is null).

If you require the original set of hits, you can call getOriginalMatchingDocs(). Also, since the counts of the top facets are based on the sampled set, you can amortize the counts by calling amortizeFacetCounts(org.apache.lucene.facet.FacetResult, org.apache.lucene.facet.FacetsConfig, org.apache.lucene.search.IndexSearcher).

  • Constructor Details

    • RandomSamplingFacetsCollector

      public RandomSamplingFacetsCollector(int sampleSize)
      Constructor with the given sample size and default seed.
    • RandomSamplingFacetsCollector

      public RandomSamplingFacetsCollector(int sampleSize, long seed)
      Constructor with the given sample size and seed.
      Parameters:
      sampleSize - The preferred sample size. If the number of hits is greater than this size, sampling is done using a sample ratio of sampleSize / totalN. For example, 1000 hits with a sample size of 10 results in a samplingRatio of 0.01. If the number of hits is lower, no sampling is done at all.
      seed - The random seed. If 0 then a seed will be chosen for you.
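
The sampleSize rule above can be illustrated with a small, standalone sketch. It is not the collector's actual code: the class name SamplingRatioSketch and the method samplingRatio are hypothetical, and returning 1.0 when no sampling takes place is an assumption of this sketch rather than documented behavior.

```java
// Hypothetical illustration of the documented sampleSize rule; not Lucene code.
final class SamplingRatioSketch {

    /**
     * Returns the fraction of hits that would be kept.
     * Assumption of this sketch: 1.0 stands for "no sampling".
     */
    static double samplingRatio(int totalHits, int sampleSize) {
        if (sampleSize <= 0) {
            throw new IllegalArgumentException("sampleSize must be positive");
        }
        if (totalHits <= sampleSize) {
            // Not more hits than the preferred sample size: keep every hit.
            return 1.0;
        }
        // More hits than the preferred sample size: sampleSize / totalN.
        return (double) sampleSize / totalHits;
    }

    public static void main(String[] args) {
        System.out.println(samplingRatio(1000, 10)); // the example from the parameter description
        System.out.println(samplingRatio(5, 10));    // fewer hits than the sample size
    }
}
```

In practice, the rate actually applied by the collector is the one reported by getSamplingRate().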
  • Method Details

    • getMatchingDocs

      public List<FacetsCollector.MatchingDocs> getMatchingDocs()
      Returns the sampled list of the matching documents. Note that a FacetsCollector.MatchingDocs instance is returned per segment, even if no hits from that segment are included in the sampled set.

      Note: One or more of the MatchingDocs might be empty (not containing any hits) as a result of sampling.

      Note: MatchingDocs.totalHits is copied from the original MatchingDocs, and scores is set to null.

      Overrides:
      getMatchingDocs in class FacetsCollector
    • getOriginalMatchingDocs

      public List<FacetsCollector.MatchingDocs> getOriginalMatchingDocs()
      Returns the original matching documents.
    • amortizeFacetCounts

      public FacetResult amortizeFacetCounts(FacetResult res, FacetsConfig config, IndexSearcher searcher) throws IOException
      Note: if you use a counting Facets implementation, you can amortize the sampled counts by calling this method. Uses the FacetsConfig and the IndexSearcher to determine the upper bound for each facet value.
      Throws:
      IOException
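
As a rough illustration of what amortizing a sampled count can look like, the sketch below scales a sampled count by the inverse of the sampling rate and caps the result at an upper bound. This is an assumption about the general approach, not the method's verified implementation: the class AmortizeSketch, the method amortize, and the rounding choice are all hypothetical, and the upper bound is simply passed in rather than derived from a FacetsConfig and an IndexSearcher.

```java
// Hypothetical illustration of scaling a sampled count; not Lucene code.
final class AmortizeSketch {

    /**
     * Estimates the unsampled count for one facet value.
     *
     * @param sampledCount count observed in the sampled set
     * @param samplingRate fraction of hits that were sampled, in (0, 1]
     * @param upperBound   largest count the facet value could possibly have
     */
    static int amortize(int sampledCount, double samplingRate, int upperBound) {
        if (samplingRate <= 0.0 || samplingRate > 1.0) {
            throw new IllegalArgumentException("samplingRate must be in (0, 1]");
        }
        // Scale the sampled count up; rounding is a choice made by this sketch.
        long scaled = Math.round(sampledCount / samplingRate);
        // Never report more than the upper bound allows.
        return (int) Math.min(scaled, (long) upperBound);
    }

    public static void main(String[] args) {
        System.out.println(amortize(5, 0.01, 10_000)); // scaled estimate stays below the bound
        System.out.println(amortize(5, 0.01, 300));    // estimate is capped at the bound
    }
}
```

The cap matters because a scaled estimate can otherwise exceed the number of documents that actually carry the facet value.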
    • getSamplingRate

      public double getSamplingRate()
      Returns the sampling rate that was used.