public class PassageScorer extends Object
Ranks passages found by PostingsHighlighter.
Each passage is scored as a miniature document within the document.
The final score is computed as norm(int) * ∑ (weight(int, int) * tf(int, int)).
The default implementation is norm(int) * BM25.
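The formula above can be sketched as a small standalone class. This is an illustration under stated assumptions, not the library's source: the class name `PassageScoreSketch`, the `score` helper, the constant values (1.2, 0.75, 87), and the exact BM25-style bodies of `weight` and `tf` are all assumptions. Only the overall shape norm * ∑ (weight * tf) and the norm formula 1 + 1/log(pivot + passageStart) come from this page.

```java
// Hypothetical standalone sketch; it does not depend on Lucene.
final class PassageScoreSketch {
    // Assumed parameter values; this page does not state them.
    static final float K1 = 1.2f;    // term frequency normalization
    static final float B = 0.75f;    // length normalization
    static final float PIVOT = 87f;  // pivot for length normalization

    // Term importance from in-document statistics (assumed IDF-like form:
    // the document is treated as roughly contentLength / PIVOT "mini documents").
    static float weight(int contentLength, int totalTermFreq) {
        float numDocs = 1 + contentLength / PIVOT;
        return (K1 + 1) * (float) Math.log(1 + (numDocs + 0.5) / (totalTermFreq + 0.5));
    }

    // Saturating term frequency, normalized by passage length (assumed BM25 form).
    static float tf(int freq, int passageLen) {
        float lengthNorm = K1 * ((1 - B) + B * (passageLen / PIVOT));
        return freq / (freq + lengthNorm);
    }

    // Position normalization as documented on this page.
    static float norm(int passageStart) {
        return 1 + 1 / (float) Math.log(PIVOT + passageStart);
    }

    // norm * sum over query terms of (weight * tf).
    // freqInPassage[i] and totalTermFreq[i] describe the same term.
    static float score(int passageStart, int passageLen, int contentLength,
                       int[] freqInPassage, int[] totalTermFreq) {
        float sum = 0f;
        for (int i = 0; i < freqInPassage.length; i++) {
            sum += weight(contentLength, totalTermFreq[i]) * tf(freqInPassage[i], passageLen);
        }
        return norm(passageStart) * sum;
    }
}
```

With identical term statistics, a passage that starts earlier in the document receives a higher score in this sketch, because only the norm factor differs.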
| Modifier and Type | Field and Description |
|---|---|
| static float | b: BM25 b parameter, controls length normalization. |
| static float | k1: BM25 k1 parameter, controls term frequency normalization. |
| static float | pivot: A pivot used for length normalization. |
| Constructor and Description |
|---|
| PassageScorer() |
| Modifier and Type | Method and Description |
|---|---|
| float | norm(int passageStart): Normalize a passage according to its position in the document. |
| float | tf(int freq, int passageLen): Computes term weight, given the frequency within the passage and the passage's length. |
| float | weight(int contentLength, int totalTermFreq): Computes term importance, given its in-document statistics. |
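Since the three methods are public instance methods, ranking can presumably be customized by subclassing and overriding one of them. The sketch below shows the idea with a scorer that ignores passage position. To stay self-contained it extends a hypothetical stand-in, `StandInPassageScorer`, rather than the real class; the stand-in, the subclass name `PositionBlindScorer`, and the pivot value 87 are assumptions made for illustration.

```java
// Hypothetical stand-in for PassageScorer so the sketch compiles without Lucene.
class StandInPassageScorer {
    static final float PIVOT = 87f; // assumed value, not stated on this page

    // Default position normalization as documented: 1 + 1/log(pivot + passageStart).
    public float norm(int passageStart) {
        return 1 + 1 / (float) Math.log(PIVOT + passageStart);
    }
}

// Treats every passage the same regardless of where it starts.
final class PositionBlindScorer extends StandInPassageScorer {
    @Override
    public float norm(int passageStart) {
        return 1f;
    }
}
```

Against the real library, the same override would be written on a subclass of PassageScorer; how that subclass is handed to the highlighter depends on the Lucene version and is not covered here.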
public static final float k1
public static final float b
public static final float pivot
public float weight(int contentLength, int totalTermFreq)
contentLength - length of the document in characters
totalTermFreq - number of times the term occurs in the document
public float tf(int freq, int passageLen)
freq - number of occurrences of the term within this passage
passageLen - length of the passage in characters
public float norm(int passageStart)
Typically passages towards the beginning of the document are more useful for summarizing the contents.
The default implementation is 1 + 1/log(pivot + passageStart)
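To see how quickly the position boost decays, the default formula can be evaluated for a few start offsets. This is a minimal sketch: the class name `NormSketch` is hypothetical, the pivot value 87 is an assumption (the value is not given on this page), and `log` is taken to be the natural logarithm as in `Math.log`.

```java
// Hypothetical helper that evaluates 1 + 1/log(pivot + passageStart).
final class NormSketch {
    static final float PIVOT = 87f; // assumed value

    static float norm(int passageStart) {
        return 1 + 1 / (float) Math.log(PIVOT + passageStart);
    }
}
```

Under these assumptions, a passage at offset 0 gets a factor of roughly 1.22 and a passage at offset 1000 roughly 1.14. The factor keeps shrinking toward 1 as the offset grows but never drops below it, so position acts as a mild boost rather than a penalty.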
passageStart - start offset of the passage
Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.