public abstract class PerFieldSimilarityWrapper extends Similarity
Provides the ability to use a different Similarity for different fields.
Subclasses should implement get(String) to return an appropriate Similarity (for example, using field-specific parameter values) for the field.
For Lucene 6, you should pass a default similarity that is used for all non field-specific methods. From Lucene 7 on, this is no longer required.
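The delegation described above can be sketched without any Lucene dependency. The types below (SimSketch, DampedSim, PerFieldSketch, TitleBodySketch) are hypothetical stand-ins, not Lucene classes, and the damping formula is an illustrative assumption. The sketch only shows the shape of the pattern: field-specific calls go to the similarity returned by get(String), and non-field-specific calls go to the default similarity.

```java
// Hypothetical stand-in for Similarity, reduced to two methods. Not a Lucene type.
interface SimSketch {
    float norm(String field, int length);     // field-specific (cf. computeNorm)
    float coord(int overlap, int maxOverlap); // not field-specific
}

// Hypothetical similarity whose parameter b controls how strongly length is penalized.
final class DampedSim implements SimSketch {
    private final float b;

    DampedSim(float b) {
        this.b = b;
    }

    @Override
    public float norm(String field, int length) {
        // b = 0 ignores length entirely; b = 1 divides by the full length.
        return 1f / (1f - b + b * length);
    }

    @Override
    public float coord(int overlap, int maxOverlap) {
        return 1f; // coordination disabled
    }
}

// Mirrors the wrapper's shape: subclasses only decide which similarity a field gets.
abstract class PerFieldSketch implements SimSketch {
    protected final SimSketch defaultSim;

    PerFieldSketch(SimSketch defaultSim) {
        this.defaultSim = defaultSim;
    }

    abstract SimSketch get(String name);

    @Override
    public final float norm(String field, int length) {
        // Field-specific work is delegated to the similarity chosen for that field.
        return get(field).norm(field, length);
    }

    @Override
    public final float coord(int overlap, int maxOverlap) {
        // No field is involved here, so the default similarity answers.
        return defaultSim.coord(overlap, maxOverlap);
    }
}

// Example subclass: weak length damping for "title", stronger for everything else.
final class TitleBodySketch extends PerFieldSketch {
    private final SimSketch title = new DampedSim(0.25f);
    private final SimSketch body = new DampedSim(0.75f);

    TitleBodySketch() {
        super(new DampedSim(0.75f));
    }

    @Override
    SimSketch get(String name) {
        return "title".equals(name) ? title : body;
    }
}
```

In real code the equivalent would presumably be a subclass of PerFieldSimilarityWrapper whose get(String) returns, for example, BM25Similarity instances configured with different parameters per field.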
Nested classes inherited from class Similarity: Similarity.SimScorer, Similarity.SimWeight
Modifier and Type | Field and Description |
---|---|
protected Similarity | defaultSim: Default similarity used for query norm and coordination factors. |
Constructor and Description |
---|
PerFieldSimilarityWrapper(): Deprecated. Specify a default similarity for non-field-specific calculations. |
PerFieldSimilarityWrapper(Similarity defaultSim): Constructor taking a default similarity for all non-field-specific calculations. |
Modifier and Type | Method and Description |
---|---|
long | computeNorm(FieldInvertState state): Computes the normalization value for a field, given the accumulated state of term processing for this field (see FieldInvertState). |
Similarity.SimWeight | computeWeight(CollectionStatistics collectionStats, TermStatistics... termStats): Compute any collection-level weight (e.g. IDF, average document length, etc.) needed for scoring a query. |
float | coord(int overlap, int maxOverlap): Hook to integrate coordinate-level matching. |
abstract Similarity | get(String name): Returns a Similarity for scoring a field. |
float | queryNorm(float valueForNormalization): Computes the normalization value for a query given the sum of the normalized weights Similarity.SimWeight.getValueForNormalization() of each of the query terms. |
Similarity.SimScorer | simScorer(Similarity.SimWeight weight, LeafReaderContext context): Creates a new Similarity.SimScorer to score matching documents from a segment of the inverted index. |
protected final Similarity defaultSim
public PerFieldSimilarityWrapper(Similarity defaultSim)
Parameters:
defaultSim - is used for all non-field-specific calculations, like queryNorm(float) and coord(int, int).
@Deprecated public PerFieldSimilarityWrapper()
Deprecated. Specify a default similarity for non-field-specific calculations. From Lucene 7 on, this will become the default constructor again, because coordination factors and query normalization will be removed.
public final long computeNorm(FieldInvertState state)
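Description copied from class: Similarity
Computes the normalization value for a field, given the accumulated state of term processing for this field (see FieldInvertState).
Matches in longer fields are less precise, so implementations of this method usually set smaller values when state.getLength() is large, and larger values when state.getLength() is small.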
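As a worked illustration of the length-based behavior described above, the sketch below uses an inverse square root of the field length, in the spirit of classic TF-IDF length normalization. LengthNormSketch and lengthNorm are hypothetical names, and the sketch returns a plain float; an actual computeNorm implementation returns a long, so the encoding step is assumed away here.

```java
// Hypothetical helper illustrating "longer field, smaller norm". Not a Lucene class.
final class LengthNormSketch {
    // Inverse square root of the number of terms in the field.
    static float lengthNorm(int fieldLength) {
        if (fieldLength <= 0) {
            return 0f; // empty field: nothing to normalize
        }
        return (float) (1.0 / Math.sqrt(fieldLength));
    }
}
```

With this choice a 4-term field gets a larger value than a 100-term field, so a match in the shorter field contributes more to the score.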
Overrides: computeNorm in class Similarity
Parameters:
state - current processing state for this field
public final Similarity.SimWeight computeWeight(CollectionStatistics collectionStats, TermStatistics... termStats)
Description copied from class: Similarity
Compute any collection-level weight (e.g. IDF, average document length, etc.) needed for scoring a query.
Overrides: computeWeight in class Similarity
Parameters:
collectionStats - collection-level statistics, such as the number of tokens in the collection.
termStats - term-level statistics, such as the document frequency of a term across the collection.
public final Similarity.SimScorer simScorer(Similarity.SimWeight weight, LeafReaderContext context) throws IOException
Description copied from class: Similarity
Creates a new Similarity.SimScorer to score matching documents from a segment of the inverted index.
Overrides: simScorer in class Similarity
Parameters:
weight - collection information from Similarity.computeWeight(CollectionStatistics, TermStatistics...)
context - segment of the inverted index to be scored.
Throws:
IOException - if there is a low-level I/O error
public final float coord(int overlap, int maxOverlap)
Description copied from class: Similarity
Hook to integrate coordinate-level matching.
By default this is disabled (returns 1), as with most modern models this will only skew performance, but some implementations such as TFIDFSimilarity override this.
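The two behaviors mentioned above can be contrasted in a short sketch. CoordSketch and its method names are hypothetical; the fraction form is an assumption modeled on what TF-IDF style implementations typically compute, namely the share of query terms that matched.

```java
// Hypothetical helper contrasting a disabled and an enabled coordination factor.
final class CoordSketch {
    // Disabled: every document gets the same factor, so scores are unaffected.
    static float disabledCoord(int overlap, int maxOverlap) {
        return 1f;
    }

    // Enabled: documents matching more of the query terms get a larger factor.
    static float fractionCoord(int overlap, int maxOverlap) {
        if (maxOverlap == 0) {
            return 1f; // no query terms, nothing to scale
        }
        return overlap / (float) maxOverlap;
    }
}
```

For a four-term query, a document matching two terms would be scaled by 0.5 under the fraction form and left unchanged when coordination is disabled.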
Overrides: coord in class Similarity
Parameters:
overlap - the number of query terms matched in the document
maxOverlap - the total number of terms in the query
public final float queryNorm(float valueForNormalization)
Description copied from class: Similarity
Computes the normalization value for a query given the sum of the normalized weights Similarity.SimWeight.getValueForNormalization() of each of the query terms. This value is passed back to the weight (Similarity.SimWeight.normalize(float, float)) of each query term, to provide a hook to attempt to make scores from different queries comparable.
By default this is disabled (returns 1), but some implementations such as TFIDFSimilarity override this.
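To see how such a hook can make queries comparable, the sketch below assumes the value passed in is a sum of squared term weights and returns its inverse square root, as TF-IDF style implementations commonly do. QueryNormSketch, sumOfSquares, and the sample weights are hypothetical.

```java
// Hypothetical helper illustrating query normalization. Not a Lucene class.
final class QueryNormSketch {
    // Assumed form of the value passed in: the sum of squared term weights.
    static float sumOfSquares(float... weights) {
        float sum = 0f;
        for (float w : weights) {
            sum += w * w;
        }
        return sum;
    }

    // Inverse square root, so normalized weights form a unit-length vector.
    static float queryNorm(float valueForNormalization) {
        return (float) (1.0 / Math.sqrt(valueForNormalization));
    }
}
```

Under these assumptions, a query with term weights (3, 4) and one with weights (6, 8) both normalize to roughly (0.6, 0.8), which is the sense in which scores from different queries become comparable.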
Overrides: queryNorm in class Similarity
Parameters:
valueForNormalization - the sum of the term normalization values
public abstract Similarity get(String name)
Returns a Similarity for scoring a field.
Copyright © 2000-2017 Apache Software Foundation. All Rights Reserved.