org.apache.lucene.search.similarities
Class LMSimilarity

java.lang.Object
  extended by org.apache.lucene.search.similarities.Similarity
      extended by org.apache.lucene.search.similarities.SimilarityBase
          extended by org.apache.lucene.search.similarities.LMSimilarity
Direct Known Subclasses:
LMDirichletSimilarity, LMJelinekMercerSimilarity

public abstract class LMSimilarity
extends SimilarityBase

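Abstract superclass for language modeling Similarities. It introduces three inner types: LMSimilarity.CollectionModel, LMSimilarity.DefaultCollectionModel, and LMSimilarity.LMStats, each described in the Nested Class Summary below.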

WARNING: This API is experimental and might change in incompatible ways in the next release.

Nested Class Summary
static interface LMSimilarity.CollectionModel
          A strategy for computing the collection language model.
static class LMSimilarity.DefaultCollectionModel
          Models p(w|C) as the number of occurrences of the term in the collection, divided by the total number of tokens + 1.
static class LMSimilarity.LMStats
          Stores the collection distribution of the current term.
 
Nested classes/interfaces inherited from class org.apache.lucene.search.similarities.Similarity
Similarity.ExactSimScorer, Similarity.SimWeight, Similarity.SloppySimScorer
 
Field Summary
protected  LMSimilarity.CollectionModel collectionModel
          The collection model.
 
Fields inherited from class org.apache.lucene.search.similarities.SimilarityBase
discountOverlaps
 
Constructor Summary
LMSimilarity()
          Creates a new instance with the default collection language model.
LMSimilarity(LMSimilarity.CollectionModel collectionModel)
          Creates a new instance with the specified collection language model.
 
Method Summary
protected  void explain(Explanation expl, BasicStats stats, int doc, float freq, float docLen)
          Subclasses should implement this method to explain the score.
protected  void fillBasicStats(BasicStats stats, CollectionStatistics collectionStats, TermStatistics termStats)
          Computes the collection probability of the current term in addition to the usual statistics.
abstract  String getName()
          Returns the name of the LM method.
protected  BasicStats newStats(String field, float queryBoost)
          Factory method to return a custom stats object.
 String toString()
          Returns the name of the LM method.
 
Methods inherited from class org.apache.lucene.search.similarities.SimilarityBase
computeNorm, computeWeight, decodeNormValue, encodeNormValue, exactSimScorer, explain, getDiscountOverlaps, log2, score, setDiscountOverlaps, sloppySimScorer
 
Methods inherited from class org.apache.lucene.search.similarities.Similarity
coord, queryNorm
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

collectionModel

protected final LMSimilarity.CollectionModel collectionModel
The collection model.

Constructor Detail

LMSimilarity

public LMSimilarity(LMSimilarity.CollectionModel collectionModel)
Creates a new instance with the specified collection language model.


LMSimilarity

public LMSimilarity()
Creates a new instance with the default collection language model.
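
The two constructors differ only in where the collection model comes from. The following is a minimal sketch of that relationship, not the Lucene source. The types `CollectionModelSketch`, `DefaultModelSketch`, `LMSketch`, and `ConstructorDemo` are hypothetical stand-ins. It assumes that the no-argument constructor simply forwards a default model to the one-argument constructor, and the default formula follows one reading of the DefaultCollectionModel summary.

```java
// Hypothetical stand-ins for illustration only; these are not the Lucene types.
interface CollectionModelSketch {
    /** Returns p(w|C) for a term seen termFreq times among totalTokens tokens. */
    float probability(long termFreq, long totalTokens);
}

/** One reading of the default model: occurrences / (total tokens + 1). */
class DefaultModelSketch implements CollectionModelSketch {
    @Override
    public float probability(long termFreq, long totalTokens) {
        return termFreq / (totalTokens + 1f);
    }
}

abstract class LMSketch {
    /** Set once at construction, mirroring the protected final field. */
    protected final CollectionModelSketch collectionModel;

    LMSketch(CollectionModelSketch collectionModel) {
        this.collectionModel = collectionModel;
    }

    LMSketch() {
        this(new DefaultModelSketch()); // assumed delegation to the other constructor
    }
}

class ConstructorDemo {
    static LMSketch withDefault() {
        return new LMSketch() { };
    }

    static LMSketch withCustom(CollectionModelSketch model) {
        return new LMSketch(model) { };
    }

    public static void main(String[] args) {
        // A custom strategy that ignores the counts, purely for demonstration.
        LMSketch custom = withCustom((termFreq, totalTokens) -> 0.5f);
        System.out.println(withDefault().collectionModel.probability(3, 9));
        System.out.println(custom.collectionModel.probability(3, 9));
    }
}
```

Because the field is final, a subclass picks its collection model once, at construction time, and cannot swap it afterwards.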

Method Detail

newStats

protected BasicStats newStats(String field,
                              float queryBoost)
Description copied from class: SimilarityBase
Factory method to return a custom stats object.

Overrides:
newStats in class SimilarityBase

fillBasicStats

protected void fillBasicStats(BasicStats stats,
                              CollectionStatistics collectionStats,
                              TermStatistics termStats)
Computes the collection probability of the current term in addition to the usual statistics.

Overrides:
fillBasicStats in class SimilarityBase
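
As a rough illustration of what "in addition to the usual statistics" means, the sketch below first records the raw counts and then stores the collection probability derived from them. All names (`StatsSketch`, `LMStatsSketch`, `FillSketch.fill`) are hypothetical. The fields are a simplified assumption rather than the actual layout of BasicStats or LMSimilarity.LMStats, and the formula follows one reading of the DefaultCollectionModel summary.

```java
/** Hypothetical, simplified stand-in for the usual corpus-level statistics. */
class StatsSketch {
    long numberOfFieldTokens; // total tokens in the field across the collection
    long totalTermFreq;       // occurrences of the current term in the collection
}

/** Adds the one extra statistic a language model needs. */
class LMStatsSketch extends StatsSketch {
    float collectionProbability; // p(w|C)
}

class FillSketch {
    static void fill(LMStatsSketch stats, long totalTermFreq, long numberOfFieldTokens) {
        if (totalTermFreq < 0 || numberOfFieldTokens < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        // Step 1: the usual statistics.
        stats.totalTermFreq = totalTermFreq;
        stats.numberOfFieldTokens = numberOfFieldTokens;
        // Step 2: the extra statistic, assumed here to be occurrences / (tokens + 1).
        stats.collectionProbability = totalTermFreq / (numberOfFieldTokens + 1f);
    }

    public static void main(String[] args) {
        LMStatsSketch stats = new LMStatsSketch();
        fill(stats, 5, 99);
        System.out.println(stats.collectionProbability);
    }
}
```

Adding 1 to the denominator keeps the division defined even for an empty collection.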

explain

protected void explain(Explanation expl,
                       BasicStats stats,
                       int doc,
                       float freq,
                       float docLen)
Description copied from class: SimilarityBase
Subclasses should implement this method to explain the score. expl already contains the score, the name of the class and the doc id, as well as the term frequency and its explanation; subclasses can add additional clauses to explain details of their scoring formulae.

The default implementation does nothing.

Overrides:
explain in class SimilarityBase
Parameters:
expl - the explanation to extend with details.
stats - the corpus level statistics.
doc - the document id.
freq - the term frequency.
docLen - the document length.

getName

public abstract String getName()
Returns the name of the LM method. The values of the parameters should be included as well.

Used in toString().


toString

public String toString()
Returns the name of the LM method. If a custom collection model strategy is used, its name is included as well.

Specified by:
toString in class SimilarityBase
See Also:
getName(), LMSimilarity.CollectionModel.getName(), LMSimilarity.DefaultCollectionModel
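
The relationship between getName() and toString() can be sketched as follows. This is an illustration under stated assumptions, not the Lucene implementation. `NamedLMSketch` and `DirichletNameSketch` are hypothetical classes. The "LM ... - ..." output format is assumed, as is the convention that a null collection model name stands for the default model.

```java
/** Hypothetical stand-in; the "LM ..." output format here is an assumption. */
abstract class NamedLMSketch {
    private final String collectionModelName; // null stands for the default model

    NamedLMSketch(String collectionModelName) {
        this.collectionModelName = collectionModelName;
    }

    /** Name of the LM method, including its parameter values. */
    abstract String getName();

    @Override
    public String toString() {
        String name = "LM " + getName();
        // Append the collection model name only when a custom model is in use.
        return collectionModelName == null ? name : name + " - " + collectionModelName;
    }
}

class DirichletNameSketch extends NamedLMSketch {
    private final int mu; // hypothetical smoothing parameter

    DirichletNameSketch(int mu, String collectionModelName) {
        super(collectionModelName);
        this.mu = mu;
    }

    @Override
    String getName() {
        return "Dirichlet(" + mu + ")";
    }

    public static void main(String[] args) {
        System.out.println(new DirichletNameSketch(2000, null));
        System.out.println(new DirichletNameSketch(2000, "uniform"));
    }
}
```

Including the parameter values in getName() lets two differently configured instances be told apart in logs and explanations.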


Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.