Class NormalizationH1


  • public class NormalizationH1
    extends Normalization
    Normalization model that assumes a uniform distribution of the term frequency.

    Although this model has no parameters in the original article, the information-based models (see IBSimilarity) introduced a multiplying factor, c. The default value of c is 1.

    WARNING: This API is experimental and might change in incompatible ways in the next release.
    • Constructor Detail

      • NormalizationH1

        public NormalizationH1(float c)
        Creates NormalizationH1 with the supplied parameter c.
        Parameters:
        c - hyper-parameter that controls the term frequency normalization with respect to the document length.
    • Method Detail

      • tfn

        public final double tfn(BasicStats stats,
                                double tf,
                                double len)
        Description copied from class: Normalization
        Returns the normalized term frequency.
        Specified by:
        tfn in class Normalization
        Parameters:
        len - the field length.
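As a rough illustration of the normalization described above, the sketch below assumes that H1 scales the raw term frequency by c and by the ratio of the average field length to this document's field length, that is, tfn = tf * c * (avgFieldLength / len). The class H1Sketch and its avgFieldLength parameter are hypothetical stand-ins for the statistics that the real method reads from BasicStats; this is not the library's source.

```java
// Hypothetical, dependency-free sketch of the assumed H1 formula.
// It does not use Lucene; avgFieldLength stands in for the value
// that the real tfn would obtain from its BasicStats argument.
class H1Sketch {

    /** Assumed H1 normalization: tf * c * (avgFieldLength / len). */
    public static double tfn(double tf, double avgFieldLength, double len, float c) {
        return tf * c * (avgFieldLength / len);
    }

    /** Same formula with the documented default of c = 1. */
    public static double tfn(double tf, double avgFieldLength, double len) {
        return tfn(tf, avgFieldLength, len, 1.0f);
    }

    public static void main(String[] args) {
        // A field half the average length doubles the normalized frequency.
        System.out.println(tfn(3.0, 100.0, 50.0));        // prints 6.0
        // A field twice the average length halves it.
        System.out.println(tfn(3.0, 100.0, 200.0));       // prints 1.5
        // c acts as a plain multiplier on the result.
        System.out.println(tfn(3.0, 100.0, 100.0, 2.0f)); // prints 6.0
    }
}
```

Under this assumption, a term occurring in a shorter-than-average field counts for more than the same raw frequency in a longer field, and c rescales the effect uniformly.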
      • explain

        public Explanation explain(BasicStats stats,
                                   double tf,
                                   double len)
        Description copied from class: Normalization
        Returns an explanation for the normalized term frequency.

        The default normalization methods use the field length of the document and the average field length to compute the normalized term frequency. This method provides a generic explanation for such methods. Subclasses that use other statistics must override this method.

        Overrides:
        explain in class Normalization
      • toString

        public String toString()
        Description copied from class: Normalization
        Subclasses must override this method to return the code of the normalization formula. Refer to the original paper for the list.
        Specified by:
        toString in class Normalization