java.lang.Object
  org.apache.lucene.search.similarities.Similarity
    org.apache.lucene.search.similarities.TFIDFSimilarity
      org.apache.lucene.search.similarities.DefaultSimilarity
        org.apache.lucene.misc.SweetSpotSimilarity
public class SweetSpotSimilarity
A similarity with a lengthNorm that provides for a "plateau" of equally good lengths, and tf helper functions.
For lengthNorm, a min/max can be specified to define the plateau of lengths that should all have a norm of 1.0. Below the min and above the max, the lengthNorm drops off as a sqrt function.
For tf, baselineTf and hyperbolicTf functions are provided, which subclasses can choose between.
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.apache.lucene.search.similarities.Similarity |
---|
Similarity.SimScorer, Similarity.SimWeight |
Field Summary |
---|
Fields inherited from class org.apache.lucene.search.similarities.DefaultSimilarity |
---|
discountOverlaps |
Constructor Summary | |
---|---|
SweetSpotSimilarity()
|
Method Summary | |
---|---|
float |
baselineTf(float freq)
Implemented as:
(x <= min) ? base : sqrt(x+(base**2)-min)
...but with a special case check for 0. |
float |
computeLengthNorm(int numTerms)
Implemented as:
1/sqrt( steepness * (abs(x-min) + abs(x-max) - (max-min)) + 1 ). |
float |
hyperbolicTf(float freq)
Uses a hyperbolic tangent function that allows for a hard max... |
float |
lengthNorm(FieldInvertState state)
Implemented as state.getBoost() * computeLengthNorm(numTokens), where numTokens does not count overlap tokens if discountOverlaps is true by default or true for this specific field. |
void |
setBaselineTfFactors(float base,
float min)
Sets the baseline and minimum function variables for baselineTf |
void |
setHyperbolicTfFactors(float min,
float max,
double base,
float xoffset)
Sets the function variables for the hyperbolicTf functions |
void |
setLengthNormFactors(int min,
int max,
float steepness,
boolean discountOverlaps)
Sets the default function variables used by lengthNorm when no field specific variables have been set. |
float |
tf(float freq)
Delegates to baselineTf |
Methods inherited from class org.apache.lucene.search.similarities.DefaultSimilarity |
---|
coord, decodeNormValue, encodeNormValue, getDiscountOverlaps, idf, queryNorm, scorePayload, setDiscountOverlaps, sloppyFreq, toString |
Methods inherited from class org.apache.lucene.search.similarities.TFIDFSimilarity |
---|
computeNorm, computeWeight, idfExplain, idfExplain, simScorer |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public SweetSpotSimilarity()
Method Detail |
---|
public void setBaselineTfFactors(float base, float min)
Sets the baseline and minimum function variables for baselineTf.
See Also: baselineTf(float)
public void setHyperbolicTfFactors(float min, float max, double base, float xoffset)
Parameters:
min - the minimum tf value to ever be returned (default: 0.0)
max - the maximum tf value to ever be returned (default: 2.0)
base - the base value to be used in the exponential for the hyperbolic function (default: 1.3)
xoffset - the midpoint of the hyperbolic function (default: 10.0)
See Also: hyperbolicTf(float)
public void setLengthNormFactors(int min, int max, float steepness, boolean discountOverlaps)
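Sets the default function variables used by lengthNorm when no field specific variables have been set.
See Also: computeLengthNorm(int)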
public float lengthNorm(FieldInvertState state)
Implemented as state.getBoost() * computeLengthNorm(numTokens), where numTokens does not count overlap tokens if discountOverlaps is true by default or true for this specific field.
Overrides: lengthNorm in class DefaultSimilarity
public float computeLengthNorm(int numTerms)
Implemented as:
1/sqrt( steepness * (abs(x-min) + abs(x-max) - (max-min)) + 1 ).
This degrades to 1/sqrt(x) when min and max are both 1 and steepness is 0.5.
:TODO: potential optimization is to just flat out return 1.0f if numTerms is between min and max.
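The formula above can be checked with a small standalone sketch. This is not the Lucene source: the class name LengthNormSketch is hypothetical, and min, max, and steepness are passed as plain arguments rather than being set through setLengthNormFactors.

```java
// Standalone illustration of the documented formula; not the Lucene implementation.
final class LengthNormSketch {

    // 1/sqrt( steepness * (abs(x-min) + abs(x-max) - (max-min)) + 1 )
    static float computeLengthNorm(int numTerms, int min, int max, float steepness) {
        // The distance term is 0 anywhere inside [min, max] and grows linearly outside it.
        float distance = Math.abs(numTerms - min) + Math.abs(numTerms - max) - (max - min);
        return (float) (1.0 / Math.sqrt(steepness * distance + 1.0f));
    }

    public static void main(String[] args) {
        // Inside the plateau [3, 5] the norm is 1.0.
        System.out.println(computeLengthNorm(4, 3, 5, 0.5f));   // prints 1.0
        // With min = max = 1 and steepness = 0.5 the formula reduces to 1/sqrt(x).
        System.out.println(computeLengthNorm(16, 1, 1, 0.5f));  // prints 0.25
        // Outside the plateau the norm falls below 1.0.
        System.out.println(computeLengthNorm(7, 3, 5, 0.5f));
    }
}
```

Because the distance term is exactly 0 on the plateau, every length between min and max receives the same norm, which is the "sweet spot" behaviour the class description refers to.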
See Also: setLengthNormFactors(int, int, float, boolean), An SVG visualization of this function
public float tf(float freq)
Delegates to baselineTf.
Overrides: tf in class DefaultSimilarity
See Also: baselineTf(float)
public float baselineTf(float freq)
Implemented as:
(x <= min) ? base : sqrt(x+(base**2)-min)
...but with a special case check for 0.
This degrades to sqrt(x) when min and base are both 0.
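A minimal standalone sketch of the expression above follows. The class name BaselineTfSketch is hypothetical, base and min are passed as arguments instead of being set through setBaselineTfFactors, and the behaviour of the "special case check for 0" is an assumption here: the sketch returns 0 for a frequency of 0.

```java
// Standalone illustration of the documented expression; not the Lucene implementation.
final class BaselineTfSketch {

    // (x <= min) ? base : sqrt(x + (base**2) - min)
    static float baselineTf(float freq, float base, float min) {
        // Assumption: the special case for 0 means an absent term contributes nothing.
        if (freq == 0.0f) {
            return 0.0f;
        }
        return (freq <= min) ? base : (float) Math.sqrt(freq + (base * base) - min);
    }

    public static void main(String[] args) {
        // With min = base = 0 the function reduces to sqrt(x).
        System.out.println(baselineTf(9.0f, 0.0f, 0.0f));  // prints 3.0
        // At or below min, every frequency gets the flat baseline value.
        System.out.println(baselineTf(3.0f, 1.5f, 5.0f));  // prints 1.5
        // Above min, tf grows as a square root starting from the baseline.
        System.out.println(baselineTf(6.0f, 1.5f, 5.0f));
    }
}
```

The two branches meet at x = min, where sqrt(min + base**2 - min) equals base, so the curve is continuous for a non-negative base.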
See Also: setBaselineTfFactors(float, float), An SVG visualization of this function
public float hyperbolicTf(float freq)
tf(x)=min+(max-min)/2*(((base**(x-xoffset)-base**-(x-xoffset))/(base**(x-xoffset)+base**-(x-xoffset)))+1)
This code is provided as a convenience for subclasses that want to use a hyperbolic tf function.
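The tf(x) expression above can be evaluated with the standalone sketch below. It is an illustration only: the class name HyperbolicTfSketch is hypothetical, and the four factors are passed as arguments rather than set through setHyperbolicTfFactors. The example values are the defaults listed for that setter (min 0.0, max 2.0, base 1.3, xoffset 10.0).

```java
// Standalone illustration of the documented expression; not the Lucene implementation.
final class HyperbolicTfSketch {

    // tf(x) = min + (max-min)/2 * ( (b^(x-xoffset) - b^-(x-xoffset))
    //                             / (b^(x-xoffset) + b^-(x-xoffset)) + 1 )
    static float hyperbolicTf(float freq, float min, float max, double base, float xoffset) {
        double x = freq - xoffset;
        double up = Math.pow(base, x);
        double down = Math.pow(base, -x);
        // This ratio equals tanh(x * ln(base)) and stays within (-1, 1).
        // Note: for very large |x| both powers can overflow or underflow in this naive form.
        double tanhLike = (up - down) / (up + down);
        return (float) (min + (max - min) / 2.0 * (tanhLike + 1.0));
    }

    public static void main(String[] args) {
        // At the midpoint (freq == xoffset) the result is halfway between min and max.
        System.out.println(hyperbolicTf(10.0f, 0.0f, 2.0f, 1.3, 10.0f));  // prints 1.0
        // Far above the midpoint the result approaches max.
        System.out.println(hyperbolicTf(100.0f, 0.0f, 2.0f, 1.3, 10.0f));
        // Far below the midpoint the result approaches min.
        System.out.println(hyperbolicTf(0.0f, 0.0f, 2.0f, 1.3, 10.0f));
    }
}
```

Since the ratio is bounded by -1 and 1, the result never leaves the range from min to max, which is the hard maximum mentioned in the method summary.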
See Also: setHyperbolicTfFactors(float, float, double, float), An SVG visualization of this function