Package org.apache.lucene.facet
Class RandomSamplingFacetsCollector
java.lang.Object
  org.apache.lucene.search.SimpleCollector
    org.apache.lucene.facet.FacetsCollector
      org.apache.lucene.facet.RandomSamplingFacetsCollector
-
All Implemented Interfaces: Collector, LeafCollector
public class RandomSamplingFacetsCollector extends FacetsCollector
Collects hits for subsequent faceting, using sampling if needed. Once you have run a search and collected hits into this collector, instantiate one of the Facets subclasses to do the facet counting. Note that this collector does not collect the scores of matching docs (i.e. FacetsCollector.MatchingDocs.scores is null).
If you require the original set of hits, you can call getOriginalMatchingDocs(). Also, since the counts of the top facets are based on the sampled set, you can amortize the counts by calling amortizeFacetCounts(org.apache.lucene.facet.FacetResult, org.apache.lucene.facet.FacetsConfig, org.apache.lucene.search.IndexSearcher).
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.facet.FacetsCollector
FacetsCollector.MatchingDocs
-
-
Constructor Summary
Constructors
RandomSamplingFacetsCollector(int sampleSize)
Constructor with the given sample size and default seed.
RandomSamplingFacetsCollector(int sampleSize, long seed)
Constructor with the given sample size and seed.
-
Method Summary
FacetResult amortizeFacetCounts(FacetResult res, FacetsConfig config, IndexSearcher searcher)
Note: if you use a counting Facets implementation, you can amortize the sampled counts by calling this method.
static CollectorManager<RandomSamplingFacetsCollector,RandomSamplingFacetsCollector> createManager(int sampleSize, long seed)
Creates a CollectorManager for concurrent random sampling through RandomSamplingFacetsCollector.
List<FacetsCollector.MatchingDocs> getMatchingDocs()
Returns the sampled list of the matching documents.
List<FacetsCollector.MatchingDocs> getOriginalMatchingDocs()
Returns the original matching documents.
double getSamplingRate()
Returns the sampling rate that was used.
-
Methods inherited from class org.apache.lucene.facet.FacetsCollector
collect, doSetNextReader, getKeepScores, scoreMode, search, search, search, searchAfter, searchAfter, searchAfter, setScorer
-
Methods inherited from class org.apache.lucene.search.SimpleCollector
getLeafCollector
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.lucene.search.LeafCollector
competitiveIterator
-
-
-
-
Constructor Detail
-
RandomSamplingFacetsCollector
public RandomSamplingFacetsCollector(int sampleSize)
Constructor with the given sample size and default seed.
See Also: RandomSamplingFacetsCollector(int, long)
-
RandomSamplingFacetsCollector
public RandomSamplingFacetsCollector(int sampleSize, long seed)
Constructor with the given sample size and seed.
Parameters:
sampleSize - The preferred sample size. If the number of hits is greater than the sample size, sampling is done using a sampling ratio of sampleSize / totalN. For example, 1000 hits with a sample size of 10 results in a sampling ratio of 0.01. If the number of hits is lower, no sampling is done at all.
seed - The random seed. If 0, a seed will be chosen for you.
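The ratio arithmetic described for sampleSize can be sketched in plain Java. This is only an illustration, not Lucene code: the class SamplingRateSketch and its method are hypothetical, and returning 1.0 when no sampling takes place is a convention chosen for this sketch, not something the documentation states about getSamplingRate().

```java
/** Illustrative helper; hypothetical and not part of Lucene. */
final class SamplingRateSketch {

  /**
   * Returns sampleSize / totalHits when there are more hits than the
   * preferred sample size. Returning 1.0 otherwise is this sketch's own
   * convention for "no sampling is done".
   */
  static double samplingRate(int sampleSize, int totalHits) {
    if (totalHits <= sampleSize) {
      return 1.0; // assumption: 1.0 stands for "every hit is kept"
    }
    return (double) sampleSize / totalHits;
  }

  public static void main(String[] args) {
    // The example from the parameter description: 1000 hits, sample size 10.
    System.out.println(samplingRate(10, 1000)); // prints 0.01
    // Fewer hits than the sample size: no sampling in this sketch.
    System.out.println(samplingRate(10, 5)); // prints 1.0
  }
}
```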
-
-
Method Detail
-
getMatchingDocs
public List<FacetsCollector.MatchingDocs> getMatchingDocs()
Returns the sampled list of the matching documents. Note that a FacetsCollector.MatchingDocs instance is returned per segment, even if no hits from that segment are included in the sampled set.
Note: One or more of the MatchingDocs might be empty (not containing any hits) as a result of sampling.
Note: MatchingDocs.totalHits is copied from the original MatchingDocs; scores is set to null.
Overrides: getMatchingDocs in class FacetsCollector
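To make the "one entry per segment, possibly empty" behaviour concrete, here is a small stand-alone sketch. It is not Lucene's internal sampling algorithm: it uses simple per-document random selection with java.util.Random, and the class SegmentSamplingSketch, its method, and the use of plain integer lists in place of MatchingDocs are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Illustrative only; hypothetical and not part of Lucene. */
final class SegmentSamplingSketch {

  /**
   * Keeps each hit with probability {@code rate}, segment by segment.
   * The result always has one list per input segment, even when a
   * segment contributes no hits to the sample.
   */
  static List<List<Integer>> sample(List<List<Integer>> hitsPerSegment, double rate, long seed) {
    Random random = new Random(seed); // a fixed seed makes the sample repeatable
    List<List<Integer>> sampled = new ArrayList<>();
    for (List<Integer> segmentHits : hitsPerSegment) {
      List<Integer> kept = new ArrayList<>();
      for (int doc : segmentHits) {
        if (random.nextDouble() < rate) {
          kept.add(doc);
        }
      }
      sampled.add(kept); // added even if empty, so segment positions line up
    }
    return sampled;
  }
}
```

Because the output keeps one list per segment, a caller in this sketch can pair each sampled list with its original segment by index, which mirrors why an empty entry is still returned.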
-
getOriginalMatchingDocs
public List<FacetsCollector.MatchingDocs> getOriginalMatchingDocs()
Returns the original matching documents.
-
amortizeFacetCounts
public FacetResult amortizeFacetCounts(FacetResult res, FacetsConfig config, IndexSearcher searcher) throws IOException
Note: if you use a counting Facets implementation, you can amortize the sampled counts by calling this method. Uses the FacetsConfig and the IndexSearcher to determine the upper bound for each facet value.
Throws: IOException
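The idea of amortizing a sampled count against an upper bound can be sketched as follows. This is a rough illustration under stated assumptions, not the library's implementation: the class AmortizeSketch is hypothetical, the upper bound is passed in as a plain number (the real method derives it from the FacetsConfig and IndexSearcher), and scaling by 1 / samplingRate with rounding is an assumed estimator.

```java
/** Illustrative only; hypothetical and not part of Lucene. */
final class AmortizeSketch {

  /**
   * Scales a count observed in the sample back up by 1 / samplingRate,
   * then caps it at the given upper bound for that facet value.
   */
  static int amortize(int sampledCount, double samplingRate, int upperBound) {
    if (samplingRate <= 0.0 || samplingRate > 1.0) {
      throw new IllegalArgumentException("samplingRate must be in (0, 1]");
    }
    long scaled = Math.round(sampledCount / samplingRate); // assumed estimator
    return (int) Math.min(scaled, (long) upperBound);
  }

  public static void main(String[] args) {
    // 7 hits seen in a 1% sample, with at most 1000 documents carrying the value.
    System.out.println(amortize(7, 0.01, 1000)); // prints 700
    // The scaled estimate (1500) exceeds the upper bound, so it is capped.
    System.out.println(amortize(15, 0.01, 1000)); // prints 1000
  }
}
```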
-
getSamplingRate
public double getSamplingRate()
Returns the sampling rate that was used.
-
createManager
public static CollectorManager<RandomSamplingFacetsCollector,RandomSamplingFacetsCollector> createManager(int sampleSize, long seed)
Creates a CollectorManager for concurrent random sampling through RandomSamplingFacetsCollector.
-
-