org.apache.lucene.search
Class DefaultSimilarity

java.lang.Object
  extended by org.apache.lucene.search.Similarity
      extended by org.apache.lucene.search.DefaultSimilarity
All Implemented Interfaces:
Serializable
Direct Known Subclasses:
SweetSpotSimilarity

public class DefaultSimilarity
extends Similarity

Expert: Default scoring implementation.

See Also:
Serialized Form

Field Summary
protected  boolean discountOverlaps
           
 
Fields inherited from class org.apache.lucene.search.Similarity
NO_DOC_ID_PROVIDED
 
Constructor Summary
DefaultSimilarity()
           
 
Method Summary
 float computeNorm(String field, FieldInvertState state)
          Implemented as state.getBoost()*lengthNorm(field, numTerms), where numTerms is FieldInvertState.getLength() if discountOverlaps is false, and FieldInvertState.getLength() - FieldInvertState.getNumOverlap() otherwise.
 float coord(int overlap, int maxOverlap)
          Implemented as overlap / maxOverlap.
 boolean getDiscountOverlaps()
          Returns whether overlap tokens are ignored when computing the norm.
 float idf(int docFreq, int numDocs)
          Implemented as log(numDocs/(docFreq+1)) + 1.
 float lengthNorm(String fieldName, int numTerms)
          Implemented as 1/sqrt(numTerms).
 float queryNorm(float sumOfSquaredWeights)
          Implemented as 1/sqrt(sumOfSquaredWeights).
 void setDiscountOverlaps(boolean v)
          Determines whether overlap tokens (tokens with a position increment of 0) are ignored when computing the norm.
 float sloppyFreq(int distance)
          Implemented as 1 / (distance + 1).
 float tf(float freq)
          Implemented as sqrt(freq).
 
Methods inherited from class org.apache.lucene.search.Similarity
decodeNorm, encodeNorm, getDefault, getNormDecoder, idf, idf, idfExplain, idfExplain, scorePayload, scorePayload, setDefault, tf
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

discountOverlaps

protected boolean discountOverlaps

Constructor Detail

DefaultSimilarity

public DefaultSimilarity()

Method Detail

computeNorm

public float computeNorm(String field,
                         FieldInvertState state)
Implemented as state.getBoost()*lengthNorm(field, numTerms), where numTerms is FieldInvertState.getLength() if discountOverlaps is false, and FieldInvertState.getLength() - FieldInvertState.getNumOverlap() otherwise.

WARNING: This API is new and experimental, and may suddenly change.

Overrides:
computeNorm in class Similarity
Parameters:
field - field name
state - current processing state for this field
Returns:
the calculated float norm
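
For illustration, the formula can be sketched in plain Java without Lucene on the classpath. The class ComputeNormSketch and the parameters boost, length and numOverlap are hypothetical stand-ins for the values returned by FieldInvertState.getBoost(), getLength() and getNumOverlap(); this is not the library's source.

```java
// Illustrative sketch only; names are hypothetical and this is not Lucene code.
class ComputeNormSketch {
    // Stand-in for the documented lengthNorm formula, 1/sqrt(numTerms).
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // boost, length and numOverlap stand in for the FieldInvertState getters.
    static float computeNorm(float boost, int length, int numOverlap,
                             boolean discountOverlaps) {
        // Overlap tokens are subtracted only when discounting is enabled.
        int numTerms = discountOverlaps ? length - numOverlap : length;
        return boost * lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        System.out.println(computeNorm(1.0f, 20, 4, false));
        System.out.println(computeNorm(1.0f, 20, 4, true));
    }
}
```

With 20 tokens of which 4 are overlaps, discounting gives numTerms = 16, so a boost of 1 yields 1/sqrt(16) = 0.25; without discounting the factor is 1/sqrt(20).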

lengthNorm

public float lengthNorm(String fieldName,
                        int numTerms)
Implemented as 1/sqrt(numTerms).

Specified by:
lengthNorm in class Similarity
Parameters:
fieldName - the name of the field
numTerms - the total number of tokens contained in the fields named fieldName of the document
Returns:
a normalization factor for hits on this field of this document
See Also:
AbstractField.setBoost(float)
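
A minimal sketch of how this factor shrinks as a field gets longer. LengthNormSketch is a hypothetical class; the fieldName parameter mirrors the documented signature but is unused here, which is an assumption of the sketch.

```java
// Illustrative sketch only; not taken from the Lucene source.
class LengthNormSketch {
    // 1/sqrt(numTerms): fields with more tokens receive a smaller factor.
    static float lengthNorm(String fieldName, int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        for (int numTerms : new int[] {1, 4, 100}) {
            System.out.println(numTerms + " tokens -> " + lengthNorm("body", numTerms));
        }
    }
}
```

A hit in a 4-token field is therefore weighted five times as heavily as the same hit in a 100-token field (1/2 versus 1/10).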

queryNorm

public float queryNorm(float sumOfSquaredWeights)
Implemented as 1/sqrt(sumOfSquaredWeights).

Specified by:
queryNorm in class Similarity
Parameters:
sumOfSquaredWeights - the sum of the squares of query term weights
Returns:
a normalization factor for query weights
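
The sketch below applies the formula to a pair of made-up query term weights. QueryNormSketch, sumOfSquares and the weights 3 and 4 are hypothetical; how Lucene gathers the weights internally is not shown.

```java
// Illustrative sketch only; the weights are invented for the example.
class QueryNormSketch {
    static float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    // Sums the squares of hypothetical per-term query weights.
    static float sumOfSquares(float[] weights) {
        float sum = 0.0f;
        for (float w : weights) {
            sum += w * w;
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] weights = {3.0f, 4.0f};
        float norm = queryNorm(sumOfSquares(weights)); // 1/sqrt(9 + 16)
        for (float w : weights) {
            System.out.println(w + " -> " + (w * norm));
        }
    }
}
```

Because every weight in a query is scaled by the same factor, the relative order of documents for that query is unchanged; after scaling, the squared weights sum to approximately 1.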

tf

public float tf(float freq)
Implemented as sqrt(freq).

Specified by:
tf in class Similarity
Parameters:
freq - the frequency of a term within a document
Returns:
a score factor based on a term's within-document frequency
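
A short sketch of the square-root damping, written as a standalone class (TfSketch is a hypothetical name, not part of Lucene).

```java
// Illustrative sketch only; not taken from the Lucene source.
class TfSketch {
    // sqrt(freq): repeated occurrences add to the score with diminishing returns.
    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    public static void main(String[] args) {
        for (float freq : new float[] {1.0f, 4.0f, 9.0f, 16.0f}) {
            System.out.println("freq " + freq + " -> " + tf(freq));
        }
    }
}
```

Quadrupling the frequency only doubles the factor, so a term that occurs 16 times scores four times as high as a single occurrence, not sixteen times.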

sloppyFreq

public float sloppyFreq(int distance)
Implemented as 1 / (distance + 1).

Specified by:
sloppyFreq in class Similarity
Parameters:
distance - the edit distance of this sloppy phrase match
Returns:
the frequency increment for this match
See Also:
PhraseQuery.setSlop(int)
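
As a sketch, the formula maps a match distance to a frequency increment as follows. SloppyFreqSketch is a hypothetical standalone class.

```java
// Illustrative sketch only; not taken from the Lucene source.
class SloppyFreqSketch {
    // 1 / (distance + 1): exact matches count fully, looser matches count less.
    static float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    public static void main(String[] args) {
        for (int distance = 0; distance <= 3; distance++) {
            System.out.println("distance " + distance + " -> " + sloppyFreq(distance));
        }
    }
}
```

An exact phrase match (distance 0) contributes 1, while a match at distance 3 contributes 1/4.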

idf

public float idf(int docFreq,
                 int numDocs)
Implemented as log(numDocs/(docFreq+1)) + 1.

Specified by:
idf in class Similarity
Parameters:
docFreq - the number of documents which contain the term
numDocs - the total number of documents in the collection
Returns:
a score factor based on the term's document frequency
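
A minimal sketch of the formula, assuming the natural logarithm (Math.log) and floating-point division. The cast matters: written literally with int operands, numDocs/(docFreq+1) would truncate. IdfSketch is a hypothetical class name.

```java
// Illustrative sketch only; assumes natural log and floating-point division.
class IdfSketch {
    static float idf(int docFreq, int numDocs) {
        // Cast to double so the ratio is not truncated by integer division.
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    public static void main(String[] args) {
        int numDocs = 1000;
        for (int docFreq : new int[] {1, 9, 500}) {
            System.out.println("docFreq " + docFreq + " -> " + idf(docFreq, numDocs));
        }
    }
}
```

Rare terms receive a larger factor than common ones: under these assumptions a term found in 9 of 1000 documents gets ln(100) + 1, roughly 5.6, while a term found in half the collection gets roughly 1.7.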

coord

public float coord(int overlap,
                   int maxOverlap)
Implemented as overlap / maxOverlap.

Specified by:
coord in class Similarity
Parameters:
overlap - the number of query terms matched in the document
maxOverlap - the total number of terms in the query
Returns:
a score factor based on term overlap with the query
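
The ratio can be sketched as below. CoordSketch is a hypothetical class; the float cast is needed because plain int division would yield 0 for any partial match.

```java
// Illustrative sketch only; not taken from the Lucene source.
class CoordSketch {
    // Fraction of the query's terms that the document matched.
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    public static void main(String[] args) {
        int maxOverlap = 4;
        for (int overlap = 1; overlap <= maxOverlap; overlap++) {
            System.out.println(overlap + " of " + maxOverlap + " -> " + coord(overlap, maxOverlap));
        }
    }
}
```

A document matching 2 of 4 query terms gets a factor of 0.5, and a document matching all of them gets 1.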

setDiscountOverlaps

public void setDiscountOverlaps(boolean v)
Determines whether overlap tokens (tokens with a position increment of 0) are ignored when computing the norm. The default is false, meaning overlap tokens are counted just like non-overlap tokens.

WARNING: This API is new and experimental, and may suddenly change.

See Also:
computeNorm(java.lang.String, org.apache.lucene.index.FieldInvertState)
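
To illustrate what counts as an overlap token, the sketch below derives numTerms from a list of position increments. DiscountOverlapsSketch and the sample increments are hypothetical; a typical source of increment 0 is an analyzer that injects a synonym at the same position as the original token.

```java
// Illustrative sketch only; the token stream is invented for the example.
class DiscountOverlapsSketch {
    // Returns the token count used for the norm, given each token's position increment.
    static int numTerms(int[] positionIncrements, boolean discountOverlaps) {
        int length = 0;     // every token counts toward the length
        int numOverlap = 0; // tokens stacked on the previous token's position
        for (int increment : positionIncrements) {
            length++;
            if (increment == 0) {
                numOverlap++;
            }
        }
        return discountOverlaps ? length - numOverlap : length;
    }

    public static void main(String[] args) {
        // Four tokens; the third shares a position with the second (increment 0).
        int[] increments = {1, 1, 0, 1};
        System.out.println(numTerms(increments, false));
        System.out.println(numTerms(increments, true));
    }
}
```

With discounting off, all four tokens count; with it on, the stacked token is ignored and numTerms is 3, so injected synonyms no longer make the field look longer.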

getDiscountOverlaps

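public boolean getDiscountOverlaps()
Returns whether overlap tokens are ignored when computing the norm, as set by setDiscountOverlaps(boolean).
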
See Also:
setDiscountOverlaps(boolean)


Copyright © 2000-2010 Apache Software Foundation. All Rights Reserved.