Class Axiomatic
java.lang.Object
  org.apache.lucene.search.similarities.Similarity
    org.apache.lucene.search.similarities.SimilarityBase
      org.apache.lucene.search.similarities.Axiomatic
Direct Known Subclasses:
AxiomaticF1EXP, AxiomaticF1LOG, AxiomaticF2EXP, AxiomaticF2LOG, AxiomaticF3EXP, AxiomaticF3LOG
public abstract class Axiomatic extends SimilarityBase
Axiomatic approaches for IR. From Hui Fang and Chengxiang Zhai. 2005. An Exploration of Axiomatic Approaches to Information Retrieval. In Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR '05). ACM, New York, NY, USA, 480-487.

This is a family of models, all based on BM25, Pivoted Document Length Normalization, and the Language Model with Dirichlet prior. Some components of the original models (e.g. Term Frequency, Inverse Document Frequency) are modified so that they satisfy a set of axiomatic constraints.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.search.similarities.Similarity
Similarity.SimScorer, Similarity.SimWeight
-
Field Summary
Fields
protected float k
    hyperparameter for the primitive weighting function
protected int queryLen
    the query length
protected float s
    hyperparameter for the growth function
-
Fields inherited from class org.apache.lucene.search.similarities.SimilarityBase
discountOverlaps
-
Constructor Summary
Constructors
Axiomatic()
    Default constructor.
Axiomatic(float s)
    Constructor setting only s, leaving k and queryLen at their defaults.
Axiomatic(float s, int queryLen)
    Constructor setting s and queryLen, leaving k at its default.
Axiomatic(float s, int queryLen, float k)
    Constructor setting all Axiomatic hyperparameters.
-
Method Summary
Methods
protected void explain(List<Explanation> subs, BasicStats stats, int doc, float freq, float docLen)
    Subclasses should implement this method to explain the score.
protected abstract float gamma(BasicStats stats, float freq, float docLen)
    compute the gamma component (only for F3EXP and F3LOG)
protected abstract float idf(BasicStats stats, float freq, float docLen)
    compute the inverse document frequency component
protected abstract float ln(BasicStats stats, float freq, float docLen)
    compute the document length component
float score(BasicStats stats, float freq, float docLen)
    Scores the document doc.
protected abstract float tf(BasicStats stats, float freq, float docLen)
    compute the term frequency component
protected abstract float tfln(BasicStats stats, float freq, float docLen)
    compute the mixed term frequency and document length component
abstract String toString()
    Name of the axiomatic method.
-
Methods inherited from class org.apache.lucene.search.similarities.SimilarityBase
computeNorm, computeWeight, explain, fillBasicStats, getDiscountOverlaps, log2, newStats, setDiscountOverlaps, simScorer
-
Constructor Detail
-
Axiomatic
public Axiomatic(float s, int queryLen, float k)
Constructor setting all Axiomatic hyperparameters.
Parameters:
s - hyperparameter for the growth function
queryLen - the query length
k - hyperparameter for the primitive weighting function
-
Axiomatic
public Axiomatic(float s)
Constructor setting only s, leaving k and queryLen at their defaults.
Parameters:
s - hyperparameter for the growth function
-
Axiomatic
public Axiomatic(float s, int queryLen)
Constructor setting s and queryLen, leaving k at its default.
Parameters:
s - hyperparameter for the growth function
queryLen - the query length
-
Axiomatic
public Axiomatic()
Default constructor
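The four constructors read as one full constructor plus three convenience overloads that fill in defaults. The sketch below illustrates that delegation pattern in a standalone class that does not depend on Lucene. The class name `HyperParams`, the default values, and the range checks are all assumptions made for illustration; this page does not state the defaults or any validation rules.

```java
/** Standalone illustration of constructor delegation; not the Lucene class. */
final class HyperParams {
  // Placeholder defaults, assumed for this sketch only.
  static final float DEFAULT_S = 0.25f;
  static final int DEFAULT_QUERY_LEN = 1;
  static final float DEFAULT_K = 0.35f;

  final float s;      // hyperparameter for the growth function
  final int queryLen; // the query length
  final float k;      // hyperparameter for the primitive weighting function

  /** Full constructor: every other constructor funnels into this one. */
  HyperParams(float s, int queryLen, float k) {
    // Assumed sanity checks; the real class may validate differently.
    if (Float.isNaN(s) || s < 0f || s > 1f) {
      throw new IllegalArgumentException("illegal s value: " + s);
    }
    if (Float.isNaN(k) || k < 0f || k > 1f) {
      throw new IllegalArgumentException("illegal k value: " + k);
    }
    if (queryLen < 0) {
      throw new IllegalArgumentException("illegal query length: " + queryLen);
    }
    this.s = s;
    this.queryLen = queryLen;
    this.k = k;
  }

  /** Sets s and queryLen, leaving k at its default. */
  HyperParams(float s, int queryLen) { this(s, queryLen, DEFAULT_K); }

  /** Sets only s, leaving queryLen and k at their defaults. */
  HyperParams(float s) { this(s, DEFAULT_QUERY_LEN, DEFAULT_K); }

  /** Uses the default for every hyperparameter. */
  HyperParams() { this(DEFAULT_S, DEFAULT_QUERY_LEN, DEFAULT_K); }
}
```

Routing every overload through the full constructor keeps any validation in one place, so a caller writing `new HyperParams(0.5f)` gets the same checks as one who passes all three values.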
-
Method Detail
-
score
public float score(BasicStats stats, float freq, float docLen)
Description copied from class: SimilarityBase
Scores the document doc. Subclasses must apply their scoring formula in this class.
Specified by:
score in class SimilarityBase
Parameters:
stats - the corpus level statistics.
freq - the term frequency.
docLen - the document length.
Returns:
the score.
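Because `score` is concrete while `tf`, `ln`, `tfln`, `idf`, and `gamma` are abstract, the class appears to follow a template-method design: the base class fixes how the components combine, and each subclass supplies the components. The sketch below shows that shape without Lucene on the classpath. The combination rule (the product of four components minus gamma), the simplified signatures without `BasicStats`, and the names `ComponentModel` and `ConstantModel` are assumptions for illustration; this page does not spell out the formula.

```java
/** Template-method sketch; a hypothetical stand-in, not the Lucene class. */
abstract class ComponentModel {
  // Each component sees the term frequency and the document length.
  protected abstract float tf(float freq, float docLen);
  protected abstract float ln(float freq, float docLen);
  protected abstract float tfln(float freq, float docLen);
  protected abstract float idf(float freq, float docLen);
  protected abstract float gamma(float freq, float docLen);

  /** Assumed combination rule: product of four components minus gamma. */
  public final float score(float freq, float docLen) {
    return tf(freq, docLen)
        * ln(freq, docLen)
        * tfln(freq, docLen)
        * idf(freq, docLen)
        - gamma(freq, docLen);
  }
}

/** Trivial subclass with constant components, only to exercise score(). */
final class ConstantModel extends ComponentModel {
  @Override protected float tf(float freq, float docLen) { return 2f; }
  @Override protected float ln(float freq, float docLen) { return 1f; }
  @Override protected float tfln(float freq, float docLen) { return 0.5f; }
  @Override protected float idf(float freq, float docLen) { return 3f; }
  @Override protected float gamma(float freq, float docLen) { return 0.25f; }
}
```

With these constants, `new ConstantModel().score(1f, 10f)` evaluates 2 * 1 * 0.5 * 3 - 0.25. A component that a given variant does not need can return a neutral value (1 for a factor, 0 for the subtracted term) so the shared `score` still applies.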
-
explain
protected void explain(List<Explanation> subs, BasicStats stats, int doc, float freq, float docLen)
Description copied from class: SimilarityBase
Subclasses should implement this method to explain the score. expl already contains the score, the name of the class and the doc id, as well as the term frequency and its explanation; subclasses can add additional clauses to explain details of their scoring formulae.
The default implementation does nothing.
Overrides:
explain in class SimilarityBase
Parameters:
subs - the list of details of the explanation to extend
stats - the corpus level statistics.
doc - the document id.
freq - the term frequency.
docLen - the document length.
-
toString
public abstract String toString()
Name of the axiomatic method.
Specified by:
toString in class SimilarityBase
-
tf
protected abstract float tf(BasicStats stats, float freq, float docLen)
compute the term frequency component
-
ln
protected abstract float ln(BasicStats stats, float freq, float docLen)
compute the document length component
-
tfln
protected abstract float tfln(BasicStats stats, float freq, float docLen)
compute the mixed term frequency and document length component
-
idf
protected abstract float idf(BasicStats stats, float freq, float docLen)
compute the inverse document frequency component
-
gamma
protected abstract float gamma(BasicStats stats, float freq, float docLen)
compute the gamma component (only for F3EXP and F3LOG)
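For a concrete feel of what a subclass might supply, the sketch below computes two components in the style of the F2-EXP function from the cited Fang and Zhai paper: a length-normalized term-frequency term and a power-law IDF term. It is a standalone approximation and not Lucene's implementation. The class name `F2ExpSketch`, the parameter names, and the exact form of each expression are assumptions, and the remaining components are treated as neutral.

```java
/** Hypothetical F2-EXP-style components; an approximation for illustration. */
final class F2ExpSketch {
  private F2ExpSketch() {}

  /**
   * Mixed term frequency and document length component:
   * freq / (freq + s + s * docLen / avgDocLen).
   */
  static double tfln(double freq, double docLen, double avgDocLen, double s) {
    return freq / (freq + s + s * docLen / avgDocLen);
  }

  /** Inverse document frequency component: (numDocs / docFreq)^k. */
  static double idf(long numDocs, long docFreq, double k) {
    return Math.pow((double) numDocs / docFreq, k);
  }

  /** Combined score for one term, with tf = ln = 1 and gamma = 0 assumed. */
  static double score(double freq, double docLen, double avgDocLen,
                      long numDocs, long docFreq, double s, double k) {
    return tfln(freq, docLen, avgDocLen, s) * idf(numDocs, docFreq, k);
  }
}
```

In this form, `s` controls how strongly longer-than-average documents are penalized, and `k` controls how quickly the weight of a term grows as it becomes rarer in the collection.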
-