public class DefaultSimilarity extends TFIDFSimilarity
Nested classes inherited from class Similarity: Similarity.ExactSimScorer, Similarity.SimWeight, Similarity.SloppySimScorer
Modifier and Type | Field and Description |
---|---|
protected boolean | discountOverlaps: True if overlap tokens (tokens with a position increment of zero) are discounted from the document's length. |
Constructor and Description |
---|
DefaultSimilarity(): Sole constructor; takes no parameters. |
Modifier and Type | Method and Description |
---|---|
float | coord(int overlap, int maxOverlap): Implemented as overlap / maxOverlap. |
boolean | getDiscountOverlaps(): Returns true if overlap tokens are discounted from the document's length. |
float | idf(long docFreq, long numDocs): Implemented as log(numDocs/(docFreq+1)) + 1. |
float | lengthNorm(FieldInvertState state): Implemented as state.getBoost()*lengthNorm(numTerms), where numTerms is FieldInvertState.getLength() if setDiscountOverlaps(boolean) is false, else it's FieldInvertState.getLength() - FieldInvertState.getNumOverlap(). |
float | queryNorm(float sumOfSquaredWeights): Implemented as 1/sqrt(sumOfSquaredWeights). |
float | scorePayload(int doc, int start, int end, BytesRef payload): The default implementation returns 1. |
void | setDiscountOverlaps(boolean v): Determines whether overlap tokens (tokens with 0 position increment) are ignored when computing norm. |
float | sloppyFreq(int distance): Implemented as 1 / (distance + 1). |
float | tf(float freq): Implemented as sqrt(freq). |
String | toString() |
Methods inherited from class TFIDFSimilarity: computeNorm, computeWeight, decodeNormValue, encodeNormValue, exactSimScorer, idfExplain, idfExplain, sloppySimScorer, tf
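The formulas in the method summary can be tried out numerically with a small plain-Java sketch. This is a standalone illustration, not Lucene's source: the class name TfIdfFormulasSketch is hypothetical and nothing is imported from Lucene. Two assumptions are made here: log is taken to be the natural logarithm, and divisions are done in floating point, because a literal integer overlap / maxOverlap in Java would truncate to 0 for any partial match.

```java
// Standalone sketch of the formulas in the method summary; not Lucene's source.
final class TfIdfFormulasSketch {

    // tf: sqrt(freq)
    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // idf: log(numDocs/(docFreq+1)) + 1
    // Assumption: natural log, with the division done in double precision.
    static float idf(long docFreq, long numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // coord: overlap / maxOverlap
    // Assumption: float division, so 2 of 4 matched terms gives 0.5, not 0.
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // queryNorm: 1/sqrt(sumOfSquaredWeights)
    static float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    public static void main(String[] args) {
        System.out.println("tf(4)        = " + tf(4f));
        System.out.println("idf(9, 10)   = " + idf(9, 10));
        System.out.println("coord(2, 4)  = " + coord(2, 4));
        System.out.println("queryNorm(4) = " + queryNorm(4f));
    }
}
```

With these inputs the sketch yields tf = 2.0, idf = 1.0 (since log(10/10) is 0), coord = 0.5 and queryNorm = 0.5.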
protected boolean discountOverlaps
public float coord(int overlap, int maxOverlap)
Implemented as overlap / maxOverlap.
Specified by: coord in class TFIDFSimilarity
Parameters:
overlap - the number of query terms matched in the document
maxOverlap - the total number of terms in the query

public float queryNorm(float sumOfSquaredWeights)
Implemented as 1/sqrt(sumOfSquaredWeights).
Specified by: queryNorm in class TFIDFSimilarity
Parameters:
sumOfSquaredWeights - the sum of the squares of query term weights

public float lengthNorm(FieldInvertState state)
Implemented as state.getBoost()*lengthNorm(numTerms), where numTerms is FieldInvertState.getLength() if setDiscountOverlaps(boolean) is false, else it's FieldInvertState.getLength() - FieldInvertState.getNumOverlap().
Specified by: lengthNorm in class TFIDFSimilarity
Parameters:
state - statistics of the current field (such as length, boost, etc.)

public float tf(float freq)
Implemented as sqrt(freq).
Specified by: tf in class TFIDFSimilarity
Parameters:
freq - the frequency of a term within a document

public float sloppyFreq(int distance)
Implemented as 1 / (distance + 1).
Specified by: sloppyFreq in class TFIDFSimilarity
Parameters:
distance - the edit distance of this sloppy phrase match
See Also: PhraseQuery.setSlop(int)
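
To show how the 1 / (distance + 1) factor falls off as a sloppy phrase match gets looser, here is a minimal standalone sketch. The class name SloppyFreqSketch is hypothetical and the code does not call Lucene. It assumes floating-point division, since an integer 1 / (distance + 1) would evaluate to 0 for every distance above zero.

```java
// Standalone sketch of the sloppyFreq formula; not Lucene's source.
final class SloppyFreqSketch {

    // 1 / (distance + 1), computed in float (assumption: no integer division).
    static float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    public static void main(String[] args) {
        // Print the factor for an exact match (distance 0) and looser matches.
        for (int distance = 0; distance <= 3; distance++) {
            System.out.println("distance=" + distance
                    + " factor=" + sloppyFreq(distance));
        }
    }
}
```

An exact phrase match (distance 0) contributes a factor of 1.0, a distance of 1 contributes 0.5, and a distance of 3 contributes 0.25.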
public float scorePayload(int doc, int start, int end, BytesRef payload)
The default implementation returns 1.
Specified by: scorePayload in class TFIDFSimilarity
Parameters:
doc - The docId currently being scored.
start - The start position of the payload
end - The end position of the payload
payload - The payload byte array to be scored

public float idf(long docFreq, long numDocs)
Implemented as log(numDocs/(docFreq+1)) + 1.
Specified by: idf in class TFIDFSimilarity
Parameters:
docFreq - the number of documents which contain the term
numDocs - the total number of documents in the collection

public void setDiscountOverlaps(boolean v)
See Also: TFIDFSimilarity.computeNorm(org.apache.lucene.index.FieldInvertState)
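
The effect of setDiscountOverlaps on the length norm can be sketched as follows. This is a standalone illustration with hypothetical names (LengthNormSketch, FieldStats); FieldStats merely stands in for the length, overlap count and boost that lengthNorm reads from FieldInvertState. The per-length factor 1/sqrt(numTerms) is an assumption made for the sketch: this page writes lengthNorm(numTerms) without defining it.

```java
// Standalone sketch of lengthNorm with and without overlap discounting;
// not Lucene's source.
final class LengthNormSketch {

    // Hypothetical stand-in for the FieldInvertState values used by lengthNorm.
    static final class FieldStats {
        final int length;      // total tokens in the field
        final int numOverlap;  // tokens with a position increment of zero
        final float boost;     // field boost

        FieldStats(int length, int numOverlap, float boost) {
            this.length = length;
            this.numOverlap = numOverlap;
            this.boost = boost;
        }
    }

    static float lengthNorm(FieldStats state, boolean discountOverlaps) {
        // Overlap tokens are subtracted only when discounting is enabled.
        int numTerms = discountOverlaps
                ? state.length - state.numOverlap
                : state.length;
        // Assumption: the per-length factor is 1/sqrt(numTerms).
        return state.boost * (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        // 25 tokens, 9 of them overlaps (for example, injected synonyms).
        FieldStats stats = new FieldStats(25, 9, 1.0f);
        System.out.println("discounted:     " + lengthNorm(stats, true));
        System.out.println("not discounted: " + lengthNorm(stats, false));
    }
}
```

Under that assumption, discounting the 9 overlap tokens leaves 16 counted terms and a norm of 0.25, whereas counting all 25 tokens gives a smaller norm of about 0.2. In other words, discounting keeps fields with many zero-increment tokens from being penalized as if they were longer.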
public boolean getDiscountOverlaps()
See Also: setDiscountOverlaps(boolean)
Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.