Class NGramDistance

java.lang.Object
org.apache.lucene.search.spell.NGramDistance
All Implemented Interfaces:
StringDistance

public class NGramDistance extends Object implements StringDistance
N-gram version of edit distance, based on the paper by Grzegorz Kondrak, "N-gram similarity and distance", in Proceedings of the Twelfth International Conference on String Processing and Information Retrieval (SPIRE 2005), pp. 115-126, Buenos Aires, Argentina, November 2005. http://www.cs.ualberta.ca/~kondrak/papers/spire05.pdf

This implementation uses the position-based optimization to compute partial matches of n-gram substrings. It adds a null-character prefix of size n-1 so that the first character is contained in the same number of n-grams as a middle character. Matches on the null-character prefix are discounted, so strings with no matching characters return a distance of 0.
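The effect of the null-character prefix can be illustrated with a minimal sketch. It is not the Lucene source. The class name `NGramPaddingSketch` and the helper `paddedNGrams` are hypothetical and exist only for this illustration. The sketch only builds the padded n-grams and does not compute any distance.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not part of Lucene.
class NGramPaddingSketch {

    // Prepends n-1 null characters to s, then lists every n-gram of the padded string.
    static List<String> paddedNGrams(String s, int n) {
        StringBuilder padded = new StringBuilder();
        for (int i = 0; i < n - 1; i++) {
            padded.append('\0'); // null-character prefix
        }
        padded.append(s);

        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= padded.length(); i++) {
            grams.add(padded.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Null characters are shown as '_' to make the output readable.
        for (String gram : paddedNGrams("word", 2)) {
            System.out.println(gram.replace('\0', '_'));
        }
        // Prints _w, wo, or, rd on separate lines.
    }
}
```

With n = 2, the first character `w` appears in two n-grams (`_w` and `wo`), the same count as the middle character `o` (`wo` and `or`). Without the prefix, `w` would appear only in `wo`.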

  • Constructor Details

    • NGramDistance

      public NGramDistance(int size)
      Creates an N-Gram distance measure using n-grams of the specified size.
      Parameters:
      size - The size of the n-gram to be used to compute the string distance.
    • NGramDistance

      public NGramDistance()
      Creates an N-Gram distance measure using n-grams of size 2.
  • Method Details

    • getDistance

      public float getDistance(String source, String target)
      Description copied from interface: StringDistance
      Returns a float between 0 and 1 based on how similar the specified strings are to one another. A value of 1 means the specified strings are identical, and 0 means the strings are maximally different.
      Specified by:
      getDistance in interface StringDistance
      Parameters:
      source - The first string.
      target - The second string.
      Returns:
      a float between 0 and 1 based on how similar the specified strings are to one another.
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • equals

      public boolean equals(Object obj)
      Overrides:
      equals in class Object
    • toString

      public String toString()
      Overrides:
      toString in class Object
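
The return contract of getDistance (1 for identical strings, 0 for maximally different ones) can be sketched with a stand-in measure. The example below uses plain Levenshtein edit distance normalized by the longer string's length. It is not Kondrak's n-gram measure and not the Lucene implementation. The class `SimilarityContractSketch` and its methods are hypothetical names used only here.

```java
// Hypothetical illustration class, not part of Lucene.
class SimilarityContractSketch {

    // Classic Levenshtein edit distance, computed with two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                // minimum of insertion, deletion, and substitution
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Maps the edit distance onto [0, 1]: 1 means identical, 0 means maximally different.
    static float similarity(String source, String target) {
        int longest = Math.max(source.length(), target.length());
        if (longest == 0) {
            return 1.0f; // two empty strings are identical
        }
        return 1.0f - (float) editDistance(source, target) / longest;
    }

    public static void main(String[] args) {
        System.out.println(similarity("lucene", "lucene")); // identical strings
        System.out.println(similarity("abc", "xyz"));       // no characters in common
        System.out.println(similarity("kitten", "sitting"));
    }
}
```

Despite the method name getDistance, a higher value means the strings are closer, so callers ranking candidates should prefer the largest score.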