Class NormalizationH2


  • public class NormalizationH2
    extends Normalization
    Normalization model in which the term frequency is inversely related to the document length.

    While this model is parameterless in the original article, the thesis introduces the parameterized variant. The default value for the c parameter is 1.

    WARNING: This API is experimental and might change in incompatible ways in the next release.
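
The inverse relationship described above can be made concrete with a minimal sketch. It assumes the H2 formula as usually stated in the DFR literature, tfn = tf * log2(1 + c * avgLen / len). The class `H2Sketch` and its method signature are hypothetical stand-ins for illustration and are not part of this API; here the average field length is passed directly rather than read from a `BasicStats` instance.

```java
/** Illustrative stand-in for the H2 normalization; not the library class. */
final class H2Sketch {

    /** Base-2 logarithm computed from the natural logarithm. */
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /**
     * Assumed H2 formula: tfn = tf * log2(1 + c * avgLen / len).
     *
     * @param tf     raw term frequency
     * @param len    field length of the document
     * @param avgLen average field length (assumed to come from collection statistics)
     * @param c      hyper-parameter controlling the strength of the normalization
     */
    static double tfn(double tf, double len, double avgLen, float c) {
        return tf * log2(1.0 + c * avgLen / len);
    }

    public static void main(String[] args) {
        double tf = 3.0;
        double avgLen = 100.0;
        // Longer fields yield a smaller normalized term frequency.
        for (double len : new double[] {50.0, 100.0, 400.0}) {
            System.out.printf("len=%.0f tfn=%.4f%n", len, tfn(tf, len, avgLen, 1.0f));
        }
    }
}
```

Under this assumed formula, with c = 1 and a field of exactly average length, the logarithm term is log2(2) = 1, so the normalized frequency equals the raw frequency. Shorter fields are boosted, longer fields are penalized, and a larger c increases the normalized value for any given length.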
    • Constructor Detail

      • NormalizationH2

        public NormalizationH2(float c)
        Creates NormalizationH2 with the supplied parameter c.
        Parameters:
        c - hyper-parameter that controls the term frequency normalization with respect to the document length.
    • Method Detail

      • tfn

        public final double tfn(BasicStats stats,
                                double tf,
                                double len)
        Description copied from class: Normalization
        Returns the normalized term frequency.
        Specified by:
        tfn in class Normalization
        Parameters:
        len - the field length.
      • explain

        public Explanation explain(BasicStats stats,
                                   double tf,
                                   double len)
        Description copied from class: Normalization
        Returns an explanation for the normalized term frequency.

        The default normalization methods use the field length of the document and the average field length to compute the normalized term frequency. This method provides a generic explanation for such methods. Subclasses that use other statistics must override this method.

        Overrides:
        explain in class Normalization
      • toString

        public String toString()
        Description copied from class: Normalization
        Subclasses must override this method to return the code of the normalization formula. Refer to the original paper for the list.
        Specified by:
        toString in class Normalization