public class SweetSpotSimilarity extends DefaultSimilarity
A similarity with a lengthNorm that provides for a "plateau" of equally good lengths, and tf helper functions.
For lengthNorm, a min/max can be specified to define the plateau of lengths that should all have a norm of 1.0. Below the min and above the max, the lengthNorm drops off following an inverse square-root curve (see computeLengthNorm).
For tf, baselineTf and hyperbolicTf functions are provided, which subclasses can choose between.
Nested classes inherited from class Similarity: Similarity.SimScorer, Similarity.SimWeight
Fields inherited from class DefaultSimilarity: discountOverlaps

| Constructor and Description |
|---|
| SweetSpotSimilarity() |

| Modifier and Type | Method and Description |
|---|---|
| float | baselineTf(float freq) Implemented as: (x <= min) ? base : sqrt(x+(base**2)-min) ...but with a special case check for 0. |
| float | computeLengthNorm(int numTerms) Implemented as: 1/sqrt( steepness * (abs(x-min) + abs(x-max) - (max-min)) + 1 ). |
| float | hyperbolicTf(float freq) Uses a hyperbolic tangent function that allows for a hard max... |
| float | lengthNorm(FieldInvertState state) Implemented as state.getBoost() * computeLengthNorm(numTokens) where numTokens does not count overlap tokens if discountOverlaps is true by default or true for this specific field. |
| void | setBaselineTfFactors(float base, float min) Sets the baseline and minimum function variables for baselineTf. |
| void | setHyperbolicTfFactors(float min, float max, double base, float xoffset) Sets the function variables for the hyperbolicTf functions. |
| void | setLengthNormFactors(int min, int max, float steepness, boolean discountOverlaps) Sets the default function variables used by lengthNorm when no field specific variables have been set. |
| float | tf(float freq) Delegates to baselineTf. |

Methods inherited from class DefaultSimilarity: coord, decodeNormValue, encodeNormValue, getDiscountOverlaps, idf, queryNorm, scorePayload, setDiscountOverlaps, sloppyFreq, toString
Methods inherited from class TFIDFSimilarity: computeNorm, computeWeight, idfExplain, idfExplain, simScorer
public void setBaselineTfFactors(float base,
float min)
See Also: baselineTf(float)
public void setHyperbolicTfFactors(float min,
float max,
double base,
float xoffset)
Parameters:
min - the minimum tf value to ever be returned (default: 0.0)
max - the maximum tf value to ever be returned (default: 2.0)
base - the base value to be used in the exponential for the hyperbolic function (default: 1.3)
xoffset - the midpoint of the hyperbolic function (default: 10.0)
See Also: hyperbolicTf(float)
public void setLengthNormFactors(int min,
int max,
float steepness,
boolean discountOverlaps)
See Also: computeLengthNorm(int)
public float lengthNorm(FieldInvertState state)
Implemented as state.getBoost() * computeLengthNorm(numTokens), where numTokens does not count overlap tokens if discountOverlaps is true by default or true for this specific field.
Overrides: lengthNorm in class DefaultSimilarity
public float computeLengthNorm(int numTerms)
Implemented as: 1/sqrt( steepness * (abs(x-min) + abs(x-max) - (max-min)) + 1 ).
This degrades to 1/sqrt(x) when min and max are both 1 and steepness is 0.5.
:TODO: potential optimization is to just flat out return 1.0f if numTerms is between min and max.
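The plateau behaviour of the formula above can be checked with a small standalone sketch. It does not use Lucene; the class name LengthNormSketch and the plateau of 3 to 5 terms are illustrative assumptions, and the factors are passed as arguments rather than set through setLengthNormFactors.

```java
// Standalone sketch of the documented lengthNorm formula; not the Lucene class itself.
final class LengthNormSketch {

    // 1/sqrt( steepness * (abs(x-min) + abs(x-max) - (max-min)) + 1 )
    static float computeLengthNorm(int numTerms, int min, int max, float steepness) {
        // The penalty is 0 anywhere inside [min, max] and grows linearly outside it.
        float penalty = Math.abs(numTerms - min) + Math.abs(numTerms - max) - (max - min);
        return (float) (1.0 / Math.sqrt(steepness * penalty + 1.0));
    }

    public static void main(String[] args) {
        // Hypothetical plateau: lengths 3..5 all get a norm of 1.0.
        for (int n = 1; n <= 8; n++) {
            System.out.printf("numTerms=%d norm=%.4f%n", n, computeLengthNorm(n, 3, 5, 0.5f));
        }
    }
}
```

With min and max both set to 1 and a steepness of 0.5, the same method reduces to 1/sqrt(numTerms), matching the degenerate case described above.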
public float tf(float freq)
Overrides: tf in class DefaultSimilarity
See Also: baselineTf(float)
public float baselineTf(float freq)
Implemented as: (x <= min) ? base : sqrt(x+(base**2)-min)
...but with a special case check for 0.
This degrades to sqrt(x) when min and base are both 0.
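As a rough illustration, the baseline function can be written as a standalone method. This sketch does not use Lucene: the class name BaselineTfSketch is hypothetical, the factors are passed as arguments rather than set through setBaselineTfFactors, and it assumes the special case returns 0 for a frequency of 0, which the text above does not spell out.

```java
// Standalone sketch of the documented baselineTf formula; not the Lucene class itself.
final class BaselineTfSketch {

    // (x <= min) ? base : sqrt(x + (base**2) - min), with a special case for 0
    static float baselineTf(float freq, float base, float min) {
        if (freq == 0.0f) {
            return 0.0f; // assumption: a term that never occurs contributes nothing
        }
        // At freq == min the sqrt branch also evaluates to base, so the curve is continuous.
        return (freq <= min) ? base : (float) Math.sqrt(freq + (base * base) - min);
    }

    public static void main(String[] args) {
        // Hypothetical factors: frequencies up to 4 all score 1.5.
        for (float f = 0; f <= 8; f++) {
            System.out.printf("freq=%.0f tf=%.4f%n", f, baselineTf(f, 1.5f, 4.0f));
        }
    }
}
```

Every frequency from 1 up to min gets the same score, so a few extra occurrences of a term do not change the ranking until the frequency passes min.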
public float hyperbolicTf(float freq)
tf(x)=min+(max-min)/2*(((base**(x-xoffset)-base**-(x-xoffset))/(base**(x-xoffset)+base**-(x-xoffset)))+1)
This code is provided as a convenience for subclasses that want to use a hyperbolic tf function.
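The formula above can be sketched as a standalone method to see how the curve is bounded. This sketch does not use Lucene: the class name HyperbolicTfSketch is hypothetical, the factors are passed as arguments using the defaults listed for setHyperbolicTfFactors, and only the documented formula is implemented, with no special-casing of a zero frequency or of overflow for very large inputs.

```java
// Standalone sketch of the documented hyperbolicTf formula; not the Lucene class itself.
final class HyperbolicTfSketch {

    // tf(x) = min + (max-min)/2 * (tanh_base(x - xoffset) + 1)
    static float hyperbolicTf(float freq, float min, float max, double base, float xoffset) {
        double x = freq - xoffset;
        double up = Math.pow(base, x);
        double down = Math.pow(base, -x);
        // Hyperbolic tangent expressed with an arbitrary base; ranges over (-1, 1).
        double tanh = (up - down) / (up + down);
        return (float) (min + (max - min) / 2 * (tanh + 1.0));
    }

    public static void main(String[] args) {
        // Documented defaults: min=0.0, max=2.0, base=1.3, xoffset=10.0.
        for (float f = 1; f <= 20; f += 3) {
            System.out.printf("freq=%.0f tf=%.4f%n", f, hyperbolicTf(f, 0.0f, 2.0f, 1.3, 10.0f));
        }
    }
}
```

At freq equal to xoffset the result is exactly halfway between min and max, and the output approaches max as the frequency grows, which gives the hard ceiling the method description refers to.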
Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.