public class DefaultSimilarity extends TFIDFSimilarity
Nested classes inherited from class Similarity: Similarity.ExactSimScorer, Similarity.SimWeight, Similarity.SloppySimScorer

| Modifier and Type | Field and Description |
|---|---|
| protected boolean | discountOverlaps - True if overlap tokens (tokens with a position increment of zero) are discounted from the document's length. |
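
To illustrate what discounting overlap tokens means for the document's length, here is a minimal standalone sketch. It does not use Lucene: the class name OverlapDiscountSketch and the parameters length and numOverlap are hypothetical stand-ins for the values that FieldInvertState.getLength() and FieldInvertState.getNumOverlap() would supply.

```java
// Standalone sketch; it does not depend on Lucene.
// OverlapDiscountSketch and its parameter names are hypothetical, chosen for illustration only.
final class OverlapDiscountSketch {

    // Number of terms counted toward the document's length.
    // length stands in for FieldInvertState.getLength(),
    // numOverlap stands in for FieldInvertState.getNumOverlap().
    static int numTerms(int length, int numOverlap, boolean discountOverlaps) {
        return discountOverlaps ? length - numOverlap : length;
    }

    public static void main(String[] args) {
        // A field with 10 tokens, 2 of which have a position increment of zero.
        System.out.println(numTerms(10, 2, true));  // prints 8
        System.out.println(numTerms(10, 2, false)); // prints 10
    }
}
```

With discounting enabled, tokens stacked at the same position do not make the field look longer than it is when the norm is computed.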

| Constructor and Description |
|---|
| DefaultSimilarity() - Sole constructor: parameter-free. |

| Modifier and Type | Method and Description |
|---|---|
| float | coord(int overlap, int maxOverlap) - Implemented as overlap / maxOverlap. |
| boolean | getDiscountOverlaps() - Returns true if overlap tokens are discounted from the document's length. |
| float | idf(long docFreq, long numDocs) - Implemented as log(numDocs/(docFreq+1)) + 1. |
| float | lengthNorm(FieldInvertState state) - Implemented as state.getBoost()*lengthNorm(numTerms), where numTerms is FieldInvertState.getLength() if setDiscountOverlaps(boolean) is false, otherwise FieldInvertState.getLength() - FieldInvertState.getNumOverlap(). |
| float | queryNorm(float sumOfSquaredWeights) - Implemented as 1/sqrt(sumOfSquaredWeights). |
| float | scorePayload(int doc, int start, int end, BytesRef payload) - The default implementation returns 1. |
| void | setDiscountOverlaps(boolean v) - Determines whether overlap tokens (tokens with a position increment of 0) are ignored when computing the norm. |
| float | sloppyFreq(int distance) - Implemented as 1 / (distance + 1). |
| float | tf(float freq) - Implemented as sqrt(freq). |
| String | toString() |
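
The formulas in the method summary can be sketched as plain Java arithmetic. This is an illustration only, not the Lucene source: the class name ScoringFactorsSketch is hypothetical, and the sketch assumes the natural logarithm and floating-point division in idf. lengthNorm and scorePayload are left out because they depend on Lucene types.

```java
// Standalone sketch of the scoring factors described in the method summary.
// It does not depend on Lucene; ScoringFactorsSketch is a hypothetical name.
final class ScoringFactorsSketch {

    // coord: overlap / maxOverlap, the fraction of query terms matched in the document.
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // queryNorm: 1/sqrt(sumOfSquaredWeights).
    static float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    // tf: sqrt(freq).
    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // idf: log(numDocs/(docFreq+1)) + 1.
    // Assumption: natural logarithm and floating-point division.
    static float idf(long docFreq, long numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // sloppyFreq: 1 / (distance + 1).
    static float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    public static void main(String[] args) {
        System.out.println("coord(2, 4)     = " + coord(2, 4));
        System.out.println("queryNorm(4)    = " + queryNorm(4f));
        System.out.println("tf(9)           = " + tf(9f));
        System.out.println("idf(1, 100)     = " + idf(1, 100));
        System.out.println("idf(10, 100)    = " + idf(10, 100));
        System.out.println("sloppyFreq(1)   = " + sloppyFreq(1));
    }
}
```

Under these formulas, terms that occur in fewer documents receive a larger idf, and a sloppy phrase match contributes less as its distance grows.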

Methods inherited from class TFIDFSimilarity: computeNorm, computeWeight, decodeNormValue, encodeNormValue, exactSimScorer, idfExplain, idfExplain, sloppySimScorer, tf

protected boolean discountOverlaps
True if overlap tokens (tokens with a position increment of zero) are discounted from the document's length.

public float coord(int overlap, int maxOverlap)
Implemented as overlap / maxOverlap.
coord in class TFIDFSimilarity
overlap - the number of query terms matched in the document
maxOverlap - the total number of terms in the query

public float queryNorm(float sumOfSquaredWeights)
Implemented as 1/sqrt(sumOfSquaredWeights).
queryNorm in class TFIDFSimilarity
sumOfSquaredWeights - the sum of the squares of query term weights

public float lengthNorm(FieldInvertState state)
Implemented as state.getBoost()*lengthNorm(numTerms), where numTerms is FieldInvertState.getLength() if setDiscountOverlaps(boolean) is false, otherwise FieldInvertState.getLength() - FieldInvertState.getNumOverlap().
lengthNorm in class TFIDFSimilarity
state - statistics of the current field (such as length, boost, etc.)

public float tf(float freq)
Implemented as sqrt(freq).
tf in class TFIDFSimilarity
freq - the frequency of a term within a document

public float sloppyFreq(int distance)
Implemented as 1 / (distance + 1).
sloppyFreq in class TFIDFSimilarity
distance - the edit distance of this sloppy phrase match
See also: PhraseQuery.setSlop(int)

public float scorePayload(int doc, int start, int end, BytesRef payload)
The default implementation returns 1.
scorePayload in class TFIDFSimilarity
doc - the docId currently being scored
start - the start position of the payload
end - the end position of the payload
payload - the payload byte array to be scored

public float idf(long docFreq, long numDocs)
Implemented as log(numDocs/(docFreq+1)) + 1.
idf in class TFIDFSimilarity
docFreq - the number of documents which contain the term
numDocs - the total number of documents in the collection

public void setDiscountOverlaps(boolean v)
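Determines whether overlap tokens (tokens with a position increment of 0) are ignored when computing the norm.
See also: TFIDFSimilarity.computeNorm(org.apache.lucene.index.FieldInvertState)

public boolean getDiscountOverlaps()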
Returns true if overlap tokens are discounted from the document's length.
See also: setDiscountOverlaps(boolean)

Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.