public class StandardFacetsAccumulator extends FacetsAccumulator
Standard implementation of FacetsAccumulator, utilizing partitions to save on memory.
Why partitions? Because if there are, say, 100M categories of which only the top K are required, we must first compute a value for all 100M categories (going over all documents), and only then can we select the top K. Working in partitions of distinct categories makes this easier on memory: once the values for a partition are found, we take the top K for that partition and move on to the next partition, then merge the top K of both, and so forth. This computes the top K with RAM needs proportional to the size of a single partition rather than to the size of all 100M categories.
The partition size is decided at indexing time, and the facet information for each partition is maintained separately.
Implementation detail: since the facet information of each partition is maintained in a separate "category list", we can be more efficient at search time, because only the facet info for a single partition needs to be read while processing that partition.
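The partition-by-partition top K selection described above can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the class `PartitionedTopK`, the `Entry` type, and the `PartitionAggregator` callback are hypothetical names introduced for illustration, and a single running min-heap stands in for the "merge the top K of both" step.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class PartitionedTopK {

    /** Hypothetical pairing of a category ordinal with its aggregated value. */
    static final class Entry {
        final int ordinal;
        final int value;

        Entry(int ordinal, int value) {
            this.ordinal = ordinal;
            this.value = value;
        }
    }

    /** Hypothetical callback that computes the values of one partition only. */
    interface PartitionAggregator {
        int[] aggregate(int partition);
    }

    /** Returns the k highest-valued categories, highest first. */
    static List<Entry> topK(PartitionAggregator aggregator, int numPartitions,
                            int partitionSize, int k) {
        Comparator<Entry> byValue = Comparator.comparingInt(e -> e.value);
        // Min-heap of the best k entries seen so far, across all partitions processed.
        PriorityQueue<Entry> best = new PriorityQueue<>(byValue);
        for (int p = 0; p < numPartitions; p++) {
            // Only this partition's values are held in memory at this point.
            int[] values = aggregator.aggregate(p);
            for (int i = 0; i < values.length; i++) {
                best.add(new Entry(p * partitionSize + i, values[i]));
                if (best.size() > k) {
                    best.poll(); // drop the current smallest entry
                }
            }
        }
        List<Entry> result = new ArrayList<>(best);
        result.sort(byValue.reversed());
        return result;
    }
}
```

Peak memory in this sketch is one partition-sized array plus at most k + 1 heap entries, which mirrors the trade-off the class description makes.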
Modifier and Type | Field and Description |
---|---|
static double | DEFAULT_COMPLEMENT_THRESHOLD: Default threshold for using the complements optimization. |
static double | DISABLE_COMPLEMENT: Passing this to setComplementThreshold(double) will disable using complement optimization. |
static double | FORCE_COMPLEMENT: Passing this to setComplementThreshold(double) will force using complement optimization. |
protected boolean | isUsingComplements |
protected int | maxPartitions |
protected int | partitionSize |
Fields inherited from class FacetsAccumulator: facetArrays, indexReader, searchParams, taxonomyReader
Constructor and Description |
---|
StandardFacetsAccumulator(FacetSearchParams searchParams, IndexReader indexReader, TaxonomyReader taxonomyReader) |
StandardFacetsAccumulator(FacetSearchParams searchParams, IndexReader indexReader, TaxonomyReader taxonomyReader, FacetArrays facetArrays) |
Modifier and Type | Method and Description |
---|---|
List<FacetResult> | accumulate(List<FacetsCollector.MatchingDocs> matchingDocs): Used by FacetsCollector to build the list of facet results that match the facet requests that were given in the constructor. |
List<FacetResult> | accumulate(ScoredDocIDs docids) |
protected ScoredDocIDs | actualDocsToAccumulate(ScoredDocIDs docids): Sets the actual set of documents over which accumulation should take place. |
protected PartitionsFacetResultsHandler | createFacetResultsHandler(FacetRequest fr): Creates a FacetResultsHandler that matches the given FacetRequest. |
protected HashMap<CategoryListIterator,Aggregator> | getCategoryListMap(FacetArrays facetArrays, int partition) |
double | getComplementThreshold(): Returns the complement threshold. |
protected double | getTotalCountsFactor(): Expert: factor by which counts should be multiplied when initializing the count arrays from total counts. |
boolean | isUsingComplements(): Returns true if complements are enabled. |
protected boolean | mayComplement(): Checks whether all requests are complementable. |
void | setComplementThreshold(double complementThreshold): Sets the complement threshold. |
protected boolean | shouldComplement(ScoredDocIDs docids): Checks whether using complements is worthwhile. |
Methods inherited from class FacetsAccumulator: create, emptyResult, getAggregator, getCategoryLists
public static final double DEFAULT_COMPLEMENT_THRESHOLD
public static final double DISABLE_COMPLEMENT
Passing this to setComplementThreshold(double) will disable using complement optimization.
public static final double FORCE_COMPLEMENT
Passing this to setComplementThreshold(double) will force using complement optimization.
protected int partitionSize
protected int maxPartitions
protected boolean isUsingComplements
public StandardFacetsAccumulator(FacetSearchParams searchParams, IndexReader indexReader, TaxonomyReader taxonomyReader)
public StandardFacetsAccumulator(FacetSearchParams searchParams, IndexReader indexReader, TaxonomyReader taxonomyReader, FacetArrays facetArrays)
public List<FacetResult> accumulate(ScoredDocIDs docids) throws IOException
Throws: IOException
protected boolean mayComplement()
protected PartitionsFacetResultsHandler createFacetResultsHandler(FacetRequest fr)
Description copied from class: FacetsAccumulator
Creates a FacetResultsHandler that matches the given FacetRequest.
Overrides: createFacetResultsHandler in class FacetsAccumulator
protected ScoredDocIDs actualDocsToAccumulate(ScoredDocIDs docids) throws IOException
Allows overriding the set of documents to accumulate over. Invoked just before the actual accumulation starts; from that point on, the set of documents remains unmodified. The default implementation returns the input unchanged.
Parameters: docids - candidate documents to accumulate for
Throws: IOException
protected boolean shouldComplement(ScoredDocIDs docids)
protected double getTotalCountsFactor()
protected HashMap<CategoryListIterator,Aggregator> getCategoryListMap(FacetArrays facetArrays, int partition) throws IOException
Pairs an Aggregator with a CategoryListIterator for each and every FacetRequest, generating a map that matches each CategoryListIterator to its aggregator.
If two CategoryListIterators are served by the same aggregator, a single aggregator is returned for both. NOTE: if a given category list iterator is needed with two different aggregators (e.g. counting and association), an exception is thrown, as this functionality is not supported at this time.
Throws: IOException
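The mapping rule described for getCategoryListMap can be illustrated with a small, self-contained sketch. It does not use the Lucene types: `CategoryListMapSketch`, `Request`, and the string stand-ins for category lists and aggregators are hypothetical, and the exception type is an assumption, since the description does not name one.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CategoryListMapSketch {

    /** Hypothetical stand-in for a facet request: the category list it reads and the aggregator it needs. */
    static final class Request {
        final String categoryList;
        final String aggregator;

        Request(String categoryList, String aggregator) {
            this.categoryList = categoryList;
            this.aggregator = aggregator;
        }
    }

    /** Maps each category list to its single aggregator, rejecting conflicting requests. */
    static Map<String, String> buildMap(List<Request> requests) {
        Map<String, String> map = new HashMap<>();
        for (Request r : requests) {
            // putIfAbsent returns the existing aggregator, or null if the list was not mapped yet.
            String previous = map.putIfAbsent(r.categoryList, r.aggregator);
            if (previous != null && !previous.equals(r.aggregator)) {
                // Assumed exception type; one list with two different aggregators is unsupported.
                throw new IllegalStateException("Category list " + r.categoryList
                        + " requested with aggregators " + previous + " and " + r.aggregator);
            }
        }
        return map;
    }
}
```

Requests that agree on the aggregator for a category list collapse into one map entry, while a counting request and an association request on the same list fail fast.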
public List<FacetResult> accumulate(List<FacetsCollector.MatchingDocs> matchingDocs) throws IOException
Description copied from class: FacetsAccumulator
Used by FacetsCollector to build the list of facet results that match the facet requests that were given in the constructor.
Overrides: accumulate in class FacetsAccumulator
Parameters: matchingDocs - the documents that matched the query, per-segment.
Throws: IOException
public double getComplementThreshold()
Returns the complement threshold.
See Also: setComplementThreshold(double)
public void setComplementThreshold(double complementThreshold)
Sets the complement threshold.
For the default setting, see DEFAULT_COMPLEMENT_THRESHOLD.
To force complements in all cases, pass FORCE_COMPLEMENT. This is mostly useful for testing purposes, as forcing complements when only a tiny fraction of the available documents match the query does not make sense and would degrade performance.
To disable complements, pass DISABLE_COMPLEMENT.
Parameters: complementThreshold - the complement threshold to set
See Also: getComplementThreshold()
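The effect of the threshold can be illustrated with a hedged, self-contained sketch. It assumes, without confirmation from this page, that the threshold is compared against the fraction of index documents that match the query, and that counting by complement means subtracting the counts of the non-matching documents from precomputed total counts. The class `ComplementSketch` and its constant values are hypothetical and are not the values of the real constants.

```java
class ComplementSketch {

    // Hypothetical values chosen so the assumed rule below always or never fires.
    static final double FORCE = 0.0;
    static final double DISABLE = Double.POSITIVE_INFINITY;

    /** Assumed rule: complement only when the matching set exceeds threshold * totalDocs. */
    static boolean shouldComplement(int matchingDocs, int totalDocs, double threshold) {
        return matchingDocs > totalDocs * threshold;
    }

    /** Derives counts for the matching set as total counts minus counts over the complement set. */
    static int[] countByComplement(int[] totalCounts, int[] complementCounts) {
        int[] counts = new int[totalCounts.length];
        for (int i = 0; i < totalCounts.length; i++) {
            counts[i] = totalCounts[i] - complementCounts[i];
        }
        return counts;
    }
}
```

Under these assumptions, a query matching most of the index only has to visit the few documents it did not match, while a query matching a tiny fraction would have to visit nearly every document, which is why forcing complements in that case degrades performance.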
public boolean isUsingComplements()
Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.