Package org.apache.lucene.facet
Class RandomSamplingFacetsCollector
java.lang.Object
org.apache.lucene.search.SimpleCollector
org.apache.lucene.facet.FacetsCollector
org.apache.lucene.facet.RandomSamplingFacetsCollector
- All Implemented Interfaces:
Collector, LeafCollector
Collects hits for subsequent faceting, using sampling if needed. Once you have run a search and collected hits into this collector, instantiate one of the Facets subclasses to do the facet counting. Note that this collector does not collect the scores of matching docs (i.e. FacetsCollector.MatchingDocs.scores is null).
If you require the original set of hits, you can call getOriginalMatchingDocs().
Also, since the counts of the top facets are based on the sampled set, you can amortize the counts by calling amortizeFacetCounts(org.apache.lucene.facet.FacetResult, org.apache.lucene.facet.FacetsConfig, org.apache.lucene.search.IndexSearcher).
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.facet.FacetsCollector
FacetsCollector.MatchingDocs
-
Constructor Summary
Constructor: Description
RandomSamplingFacetsCollector(int sampleSize): Constructor with the given sample size and default seed.
RandomSamplingFacetsCollector(int sampleSize, long seed): Constructor with the given sample size and seed.
-
Method Summary
Modifier and Type, Method: Description
FacetResult amortizeFacetCounts(FacetResult res, FacetsConfig config, IndexSearcher searcher): Note: if you use a counting Facets implementation, you can amortize the sampled counts by calling this method.
static CollectorManager<RandomSamplingFacetsCollector,RandomSamplingFacetsCollector> createManager(int sampleSize, long seed): Creates a CollectorManager for concurrent random sampling through RandomSamplingFacetsCollector.
getMatchingDocs(): Returns the sampled list of the matching documents.
getOriginalMatchingDocs(): Returns the original matching documents.
double getSamplingRate(): Returns the sampling rate that was used.
Methods inherited from class org.apache.lucene.facet.FacetsCollector
collect, doSetNextReader, finish, getKeepScores, scoreMode, search, search, search, searchAfter, searchAfter, searchAfter, setScorer
Methods inherited from class org.apache.lucene.search.SimpleCollector
getLeafCollector
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.lucene.search.LeafCollector
collect, competitiveIterator
-
Constructor Details
-
RandomSamplingFacetsCollector
public RandomSamplingFacetsCollector(int sampleSize)
Constructor with the given sample size and default seed.
-
RandomSamplingFacetsCollector
public RandomSamplingFacetsCollector(int sampleSize, long seed)
Constructor with the given sample size and seed.
- Parameters:
sampleSize
- The preferred sample size. If the number of hits is greater than this size, sampling is done using a sampling ratio of sampleSize / totalN, where totalN is the total number of hits. For example: 1000 hits with a sample size of 10 results in a sampling ratio of 0.01. If the number of hits is lower, no sampling is done at all.
seed
- The random seed. If 0, a seed will be chosen for you.
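The sampling-ratio rule described for sampleSize can be illustrated with a small standalone sketch. It does not call Lucene: the class SamplingRatioSketch and its method samplingRatio are hypothetical names, and returning 1.0 when no sampling takes place is a convention of this sketch, not documented behavior of getSamplingRate().

```java
/** Hypothetical helper, not part of Lucene: mirrors the ratio rule described above. */
final class SamplingRatioSketch {

    private SamplingRatioSketch() {}

    /**
     * Returns sampleSize / totalHits when there are more hits than the
     * preferred sample size. Returns 1.0 (keep everything) otherwise; that
     * value is an assumption of this sketch.
     */
    static double samplingRatio(int totalHits, int sampleSize) {
        if (sampleSize <= 0) {
            throw new IllegalArgumentException("sampleSize must be positive");
        }
        if (totalHits <= sampleSize) {
            return 1.0; // not more hits than the sample size: no sampling
        }
        return (double) sampleSize / totalHits;
    }

    public static void main(String[] args) {
        // The documented example: 1000 hits, sample size 10.
        System.out.println(samplingRatio(1000, 10)); // prints 0.01
        // Fewer hits than the sample size: nothing is sampled away.
        System.out.println(samplingRatio(5, 10));
    }
}
```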
-
-
Method Details
-
getMatchingDocs
Returns the sampled list of the matching documents. Note that a FacetsCollector.MatchingDocs instance is returned per segment, even if no hits from that segment are included in the sampled set.
Note: One or more of the MatchingDocs might be empty (not containing any hits) as a result of sampling.
Note: MatchingDocs.totalHits is copied from the original MatchingDocs; scores is set to null.
- Overrides:
getMatchingDocs in class FacetsCollector
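The per-segment behavior noted for getMatchingDocs (one entry per segment, possibly empty, with the original hit count preserved) can be pictured with a standalone sketch. It does not use Lucene types: SegmentSample is a hypothetical stand-in for MatchingDocs, and simple per-hit coin flipping is swapped in for illustration, which is not necessarily how this collector selects documents.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration, not Lucene code. */
final class PerSegmentSamplingSketch {

    /** Stand-in for a per-segment result: the kept doc IDs plus the original hit count. */
    static final class SegmentSample {
        final List<Integer> sampledDocs;
        final int totalHits; // copied from the original segment, not from the sample

        SegmentSample(List<Integer> sampledDocs, int totalHits) {
            this.sampledDocs = sampledDocs;
            this.totalHits = totalHits;
        }
    }

    private PerSegmentSamplingSketch() {}

    /** Keeps each hit with probability rate; always emits one entry per segment. */
    static List<SegmentSample> sample(List<List<Integer>> segments, double rate, long seed) {
        Random random = new Random(seed); // same seed and input give the same sample
        List<SegmentSample> result = new ArrayList<>();
        for (List<Integer> hits : segments) {
            List<Integer> kept = new ArrayList<>();
            for (int doc : hits) {
                if (random.nextDouble() < rate) {
                    kept.add(doc);
                }
            }
            // An entry is added even when kept is empty.
            result.add(new SegmentSample(kept, hits.size()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<Integer>> segments = List.of(List.of(0, 1, 2, 3), List.of(7), List.of());
        for (SegmentSample s : sample(segments, 0.5, 42L)) {
            System.out.println(s.sampledDocs + " sampled of " + s.totalHits + " hits");
        }
    }
}
```

Passing an explicit seed, as in the two-argument constructor, is what makes a sampled run repeatable in this sketch.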
-
getOriginalMatchingDocs
Returns the original matching documents.
-
amortizeFacetCounts
public FacetResult amortizeFacetCounts(FacetResult res, FacetsConfig config, IndexSearcher searcher) throws IOException
Note: if you use a counting Facets implementation, you can amortize the sampled counts by calling this method. Uses the FacetsConfig and the IndexSearcher to determine the upper bound for each facet value.
- Throws:
IOException
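The idea behind amortizing can be sketched as plain arithmetic: scale a sampled count back up by the sampling rate, then cap it at the upper bound for that facet value. This is a minimal standalone sketch, not the method's actual implementation: AmortizeSketch and amortize are hypothetical names, the upper bound is passed in directly instead of being derived from a FacetsConfig and IndexSearcher, and the rounding choice is an assumption.

```java
/** Hypothetical helper, not part of Lucene: scales a sampled count back up. */
final class AmortizeSketch {

    private AmortizeSketch() {}

    /**
     * Estimates the unsampled count as sampledCount / samplingRate, rounded
     * to the nearest whole number (an assumption of this sketch), and never
     * returns more than upperBound.
     */
    static long amortize(long sampledCount, double samplingRate, long upperBound) {
        if (samplingRate <= 0.0 || samplingRate > 1.0) {
            throw new IllegalArgumentException("samplingRate must be in (0, 1]");
        }
        long estimate = Math.round(sampledCount / samplingRate);
        return Math.min(estimate, upperBound);
    }

    public static void main(String[] args) {
        // 7 sampled hits at a rate of 0.01 suggest about 700 hits overall.
        System.out.println(amortize(7, 0.01, 10_000)); // prints 700
        // The estimate is capped when it exceeds the upper bound.
        System.out.println(amortize(7, 0.01, 500)); // prints 500
    }
}
```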
-
getSamplingRate
public double getSamplingRate()
Returns the sampling rate that was used.
-
createManager
public static CollectorManager<RandomSamplingFacetsCollector,RandomSamplingFacetsCollector> createManager(int sampleSize, long seed)
Creates a CollectorManager for concurrent random sampling through RandomSamplingFacetsCollector.
-