Class TermStatistics
This class holds statistics for this term across all documents for scoring purposes:
- docFreq: number of documents this term occurs in.
- totalTermFreq: number of tokens for this term.
The following conditions are always true:
- All statistics are positive integers: never zero or negative.
- docFreq <= totalTermFreq
- docFreq <= sumDocFreq of the collection
- totalTermFreq <= sumTotalTermFreq of the collection
Values may include statistics on deleted documents that have not yet been merged away.
Be careful when performing calculations on these values: they are represented as 64-bit
integers, so you may need to cast to double for your use.
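Because both statistics are long values, dividing one by the other truncates the result. A minimal sketch of the pitfall follows; AverageTermFreqSketch and its method names are hypothetical helpers for illustration and are not part of Lucene. The two statistics are passed in as plain arguments.

```java
/** Hypothetical helper, not part of Lucene: shows why a cast to double matters. */
final class AverageTermFreqSketch {

  /** Integer division: the fractional part is silently discarded. */
  static long truncatedAverage(long totalTermFreq, long docFreq) {
    return totalTermFreq / docFreq;
  }

  /** Casting one operand to double first keeps the fractional part. */
  static double averagePerDocument(long totalTermFreq, long docFreq) {
    return (double) totalTermFreq / docFreq;
  }
}
```

With totalTermFreq = 7 and docFreq = 2, truncatedAverage returns 3 while averagePerDocument returns 3.5.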
WARNING: This API is experimental and might change in incompatible ways in the next release.
Constructor Summary
Constructor: TermStatistics(BytesRef term, long docFreq, long totalTermFreq)
Description: Creates statistics instance for a term.
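Method Summary
- docFreq(): The number of documents this term occurs in.
- term(): The term text.
- toString()
- totalTermFreq(): The total number of occurrences of this term.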
Constructor Details
TermStatistics
Creates statistics instance for a term.
Parameters:
- term: term bytes.
- docFreq: number of documents containing the term in the collection.
- totalTermFreq: number of occurrences of the term in the collection.
Throws:
- NullPointerException: if term is null.
- IllegalArgumentException: if docFreq is negative or zero.
- IllegalArgumentException: if totalTermFreq is less than docFreq.
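The documented checks can be illustrated with a stand-in class. This is a sketch of the contract described above, not Lucene's source: TermStatsSketch is a hypothetical name, the exception messages are invented, and a plain byte[] stands in for BytesRef so that the example has no dependencies.

```java
/** Hypothetical stand-in that mirrors the documented constructor contract. */
final class TermStatsSketch {
  private final byte[] term;
  private final long docFreq;
  private final long totalTermFreq;

  TermStatsSketch(byte[] term, long docFreq, long totalTermFreq) {
    // NullPointerException if term is null.
    if (term == null) {
      throw new NullPointerException("term must not be null");
    }
    // IllegalArgumentException if docFreq is negative or zero.
    if (docFreq <= 0) {
      throw new IllegalArgumentException("docFreq must be positive, got " + docFreq);
    }
    // IllegalArgumentException if totalTermFreq is less than docFreq.
    if (totalTermFreq < docFreq) {
      throw new IllegalArgumentException(
          "totalTermFreq must be at least docFreq, got " + totalTermFreq + " < " + docFreq);
    }
    this.term = term;
    this.docFreq = docFreq;
    this.totalTermFreq = totalTermFreq;
  }

  byte[] term() {
    return term;
  }

  long docFreq() {
    return docFreq;
  }

  long totalTermFreq() {
    return totalTermFreq;
  }
}
```

Checking docFreq first means that a positive totalTermFreq follows from the second check, which is consistent with the statement that all statistics are positive.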
Method Details
term
The term text. This value is never null.
Returns:
- term's text, not null
docFreq
public final long docFreq()
The number of documents this term occurs in.
This is the document frequency for the term: the count of documents in which the term appears at least once.
This value is always a positive number and never exceeds totalTermFreq. It also cannot exceed CollectionStatistics.sumDocFreq().
Returns:
- document frequency, in the range [1 .. totalTermFreq()]
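Document frequency is the usual input to an inverse document frequency (IDF) weight. A minimal sketch follows, assuming a BM25-style formula and a docCount value supplied by the caller (for instance, the number of documents in the collection). IdfSketch, its parameter names, and the requirement that docFreq not exceed docCount are assumptions of this example, not statements about this class.

```java
/** Hypothetical helper, not part of Lucene: a BM25-style IDF from a document frequency. */
final class IdfSketch {

  /** Computes ln(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5)). */
  static double idf(long docFreq, long docCount) {
    // Assumption of this sketch: 1 <= docFreq <= docCount.
    if (docFreq <= 0 || docFreq > docCount) {
      throw new IllegalArgumentException(
          "expected 1 <= docFreq <= docCount, got docFreq=" + docFreq + ", docCount=" + docCount);
    }
    // Adding 0.5 promotes the long operands to double before the division.
    return Math.log(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
  }
}
```

With this formula a term that occurs in few documents receives a larger weight than one that occurs in many, and the weight stays positive even when the term occurs in every document.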
totalTermFreq
public final long totalTermFreq()
The total number of occurrences of this term.
This is the token count for the term: the number of times it appears in the field across all documents.
This value is always a positive number, is always at least docFreq(), and never exceeds CollectionStatistics.sumTotalTermFreq().
Returns:
- number of occurrences, in the range [docFreq() .. CollectionStatistics.sumTotalTermFreq()]
toString