public class TermStatistics extends Object
Contains statistics for a specific term.
This class holds statistics for this term across all documents for scoring purposes:
docFreq: number of documents this term occurs in.
totalTermFreq: number of tokens for this term.
The following conditions are always true:
- All statistics are positive integers: never zero or negative.
- docFreq never exceeds totalTermFreq.
- docFreq never exceeds sumDocFreq of the collection.
- totalTermFreq never exceeds sumTotalTermFreq of the collection.
Values may include statistics on deleted documents that have not yet been merged away.
Be careful when performing calculations on these values: they are represented as 64-bit integers, so you may need to cast to double for your use.
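The casting caveat can be illustrated with a minimal sketch. It uses plain long values in place of a real TermStatistics instance; the class name AvgTermFreqSketch and the sample counts are hypothetical, chosen only to show how integer division truncates when computing the average number of occurrences per matching document.

```java
// Hypothetical helper, not part of Lucene: shows why a cast to double may be needed.
final class AvgTermFreqSketch {

    // Integer division on long values: the fractional part is silently discarded.
    static long truncatedAverage(long totalTermFreq, long docFreq) {
        return totalTermFreq / docFreq;
    }

    // Casting one operand to double keeps the fractional part.
    static double exactAverage(long totalTermFreq, long docFreq) {
        return (double) totalTermFreq / docFreq;
    }

    public static void main(String[] args) {
        long docFreq = 2L;        // sample value: the term occurs in 2 documents
        long totalTermFreq = 7L;  // sample value: the term occurs 7 times overall
        System.out.println(truncatedAverage(totalTermFreq, docFreq)); // prints 3
        System.out.println(exactAverage(totalTermFreq, docFreq));     // prints 3.5
    }
}
```

Because docFreq is documented as always positive, the division itself is safe; only the truncation needs attention.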
WARNING: This API is experimental and might change in incompatible ways in the next release.
Constructors
TermStatistics(BytesRef term, long docFreq, long totalTermFreq): Creates a statistics instance for a term.
Methods
long docFreq(): The number of documents this term occurs in.
BytesRef term(): The term text.
long totalTermFreq(): The total number of occurrences of this term.
public TermStatistics(BytesRef term, long docFreq, long totalTermFreq)
Creates a statistics instance for a term.
Parameters:
term - Term bytes.
docFreq - number of documents containing the term in the collection.
totalTermFreq - number of occurrences of the term in the collection.
Throws:
if docFreq is negative or zero.
if totalTermFreq is less than docFreq.
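The constructor contract can be sketched as a small stand-in class. This is not Lucene's source: TermStatsSketch is a hypothetical name, byte[] stands in for BytesRef, and the exception types (NullPointerException, IllegalArgumentException) and the null check are assumptions, since this page does not name them.

```java
// Hypothetical stand-in for illustration only; not Lucene's implementation.
final class TermStatsSketch {
    private final byte[] term;        // stands in for BytesRef
    private final long docFreq;
    private final long totalTermFreq;

    TermStatsSketch(byte[] term, long docFreq, long totalTermFreq) {
        // Assumption: a null term is rejected.
        if (term == null) {
            throw new NullPointerException("term must not be null");
        }
        // Documented condition: docFreq must not be negative or zero.
        if (docFreq <= 0) {
            throw new IllegalArgumentException("docFreq must be positive, got " + docFreq);
        }
        // Documented condition: totalTermFreq must not be less than docFreq.
        if (totalTermFreq < docFreq) {
            throw new IllegalArgumentException(
                "totalTermFreq must be at least docFreq, got totalTermFreq="
                    + totalTermFreq + ", docFreq=" + docFreq);
        }
        this.term = term;
        this.docFreq = docFreq;
        this.totalTermFreq = totalTermFreq;
    }

    byte[] term() { return term; }
    long docFreq() { return docFreq; }
    long totalTermFreq() { return totalTermFreq; }
}
```

Under these assumptions, `new TermStatsSketch(bytes, 2, 7)` succeeds, while `new TermStatsSketch(bytes, 0, 7)` and `new TermStatsSketch(bytes, 3, 2)` are both rejected.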
public final BytesRef term()
The term text.
This value is never null.
Returns: term's text, not null.
public final long docFreq()
The number of documents this term occurs in.
This is the document frequency for the term: the count of documents where the term appears at least once.
This value is always a positive number, and never exceeds totalTermFreq. It also cannot exceed sumDocFreq of the collection.
Returns: document frequency, in the range [1 .. totalTermFreq()].
public final long totalTermFreq()
The total number of occurrences of this term.
This is the token count for the term: the number of times it appears in the field across all documents.
This value is always a positive number, always at least docFreq(), and never exceeds sumTotalTermFreq of the collection.
Returns: number of occurrences, in the range [docFreq() .. sumTotalTermFreq of the collection].