Class TermStatistics
- java.lang.Object
  - org.apache.lucene.search.TermStatistics
public class TermStatistics extends Object
Contains statistics for a specific term.
This class holds statistics for this term across all documents for scoring purposes:
- docFreq: number of documents this term occurs in.
- totalTermFreq: number of tokens for this term.
The following conditions are always true:
- All statistics are positive integers: never zero or negative.
- docFreq <= totalTermFreq
- docFreq <= sumDocFreq of the collection
- totalTermFreq <= sumTotalTermFreq of the collection
Values may include statistics on deleted documents that have not yet been merged away.
Be careful when performing calculations on these values: they are represented as 64-bit integers, so you may need to cast to double for your use.
WARNING: This API is experimental and might change in incompatible ways in the next release.
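The note about casting to double can be illustrated with a minimal sketch. The class and method names below are hypothetical and not part of Lucene; plain long values stand in for what docFreq() and totalTermFreq() would return.

```java
// Illustrative helper, not part of Lucene: class and method names are hypothetical.
final class TermStatsMath {

    // Average number of occurrences per matching document, computed in floating point.
    // Dividing the two longs directly would truncate the fractional part.
    static double avgTermFreq(long docFreq, long totalTermFreq) {
        return (double) totalTermFreq / (double) docFreq;
    }

    public static void main(String[] args) {
        long docFreq = 2L;        // stand-in for TermStatistics.docFreq()
        long totalTermFreq = 7L;  // stand-in for TermStatistics.totalTermFreq()

        System.out.println(totalTermFreq / docFreq);             // integer division, prints 3
        System.out.println(avgTermFreq(docFreq, totalTermFreq)); // floating point, prints 3.5
    }
}
```

Because docFreq is documented as always positive, the division in this sketch does not need a zero check.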
Constructor Summary
Constructors
- TermStatistics(BytesRef term, long docFreq, long totalTermFreq): Creates statistics instance for a term.
Method Summary
- long docFreq(): The number of documents this term occurs in.
- BytesRef term(): The term text.
- String toString()
- long totalTermFreq(): The total number of occurrences of this term.
Constructor Detail
TermStatistics
public TermStatistics(BytesRef term, long docFreq, long totalTermFreq)
Creates statistics instance for a term.
Parameters:
- term - Term bytes
- docFreq - number of documents containing the term in the collection.
- totalTermFreq - number of occurrences of the term in the collection.
Throws:
- NullPointerException - if term is null.
- IllegalArgumentException - if docFreq is negative or zero.
- IllegalArgumentException - if totalTermFreq is less than docFreq.
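The documented argument checks can be sketched in plain Java. This is an illustration of the conditions listed above, not Lucene's actual source: the class name, method name, and messages are hypothetical, and a byte[] takes the place of BytesRef so the sketch has no Lucene dependency.

```java
// Illustrative stand-in, not Lucene source: names and messages are hypothetical,
// and byte[] replaces BytesRef so that no Lucene dependency is needed.
final class TermStatsChecks {

    // Mirrors the documented conditions: term must not be null,
    // docFreq must be positive, and totalTermFreq must be at least docFreq.
    static void check(byte[] term, long docFreq, long totalTermFreq) {
        if (term == null) {
            throw new NullPointerException("term must not be null");
        }
        if (docFreq <= 0) {
            throw new IllegalArgumentException("docFreq must be positive, got " + docFreq);
        }
        if (totalTermFreq < docFreq) {
            throw new IllegalArgumentException(
                "totalTermFreq must be at least docFreq, got totalTermFreq=" + totalTermFreq
                    + ", docFreq=" + docFreq);
        }
    }

    public static void main(String[] args) {
        byte[] term = new byte[] {1, 2, 3};

        check(term, 3, 5); // accepted: 0 < docFreq <= totalTermFreq

        try {
            check(term, 5, 3); // rejected: totalTermFreq is less than docFreq
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under these conditions a caller can treat a successfully constructed instance as satisfying 1 <= docFreq <= totalTermFreq.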
Method Detail
term
public final BytesRef term()
The term text.
This value is never null.
Returns:
- term's text, not null
docFreq
public final long docFreq()
The number of documents this term occurs in.
This is the document-frequency for the term: the count of documents where the term appears at least one time.
This value is always a positive number, and never exceeds totalTermFreq. It also cannot exceed CollectionStatistics.sumDocFreq().
Returns:
- document frequency, in the range [1 .. totalTermFreq()]
See Also:
- TermsEnum.docFreq()
totalTermFreq
public final long totalTermFreq()
The total number of occurrences of this term.
This is the token count for the term: the number of times it appears in the field across all documents.
This value is always a positive number, always at least docFreq(), and never exceeds CollectionStatistics.sumTotalTermFreq().
Returns:
- number of occurrences, in the range [docFreq() .. CollectionStatistics.sumTotalTermFreq()]
See Also:
- TermsEnum.totalTermFreq()