Class LuceneLevenshteinDistance

java.lang.Object
org.apache.lucene.search.spell.LuceneLevenshteinDistance
All Implemented Interfaces:
StringDistance

public final class LuceneLevenshteinDistance extends Object implements StringDistance
Damerau-Levenshtein (optimal string alignment) distance, implemented in a way that is consistent with Lucene's FuzzyTermsEnum when the transpositions option is enabled.

Notes:

  • This metric treats full Unicode code points as characters.
  • This metric scales raw edit distances into a floating-point score based on the shorter of the two terms.
  • Transpositions of two adjacent codepoints are treated as primitive edits.
  • Edits are applied in parallel: for example, "ab" and "bca" have distance 3.
NOTE: This class is not particularly efficient. It is intended only for merging results from multiple DirectSpellCheckers.
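
The notes above can be illustrated with a minimal, self-contained sketch. It is not the Lucene source: the class name OsaSketch and its methods are hypothetical, the scaling formula (1 minus the raw distance divided by the code-point length of the shorter term) is an assumption drawn from the second note, and the handling of empty strings is a choice made for this sketch only.

```java
// Hypothetical illustration only; not Lucene's implementation.
final class OsaSketch {

    /** Raw optimal string alignment distance, counted over Unicode code points. */
    static int distance(String a, String b) {
        // Compare code points rather than UTF-16 chars, so a surrogate pair is one symbol.
        int[] s = a.codePoints().toArray();
        int[] t = b.codePoints().toArray();
        int n = s.length;
        int m = t.length;

        int[][] d = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) d[i][0] = i; // delete all of s
        for (int j = 0; j <= m; j++) d[0][j] = j; // insert all of t

        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
                int best = Math.min(
                        Math.min(d[i - 1][j] + 1,    // deletion
                                 d[i][j - 1] + 1),   // insertion
                        d[i - 1][j - 1] + cost);     // match or substitution
                // Transposition of two adjacent code points counts as a single edit.
                if (i > 1 && j > 1 && s[i - 1] == t[j - 2] && s[i - 2] == t[j - 1]) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[n][m];
    }

    /** Scaled score; ASSUMED formula: 1 - distance / (length of the shorter term). */
    static float score(String a, String b) {
        int n = a.codePointCount(0, a.length());
        int m = b.codePointCount(0, b.length());
        int shorter = Math.min(n, m);
        if (shorter == 0) {
            // Sketch-only convention for empty input; not taken from the documentation.
            return (n == m) ? 1.0f : 0.0f;
        }
        return 1.0f - (float) distance(a, b) / shorter;
    }
}
```

Under these assumptions, OsaSketch.distance("ab", "ba") counts the swap as one edit, while OsaSketch.distance("ab", "bca") is 3, because the transposed pair cannot be edited a second time to insert the "c".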
  • Constructor Details

    • LuceneLevenshteinDistance

      public LuceneLevenshteinDistance()
      Creates a new comparator, mimicking the behavior of Lucene's internal edit distance.
  • Method Details

    • getDistance

      public float getDistance(String target, String other)
      Description copied from interface: StringDistance
      Returns a float between 0 and 1 based on how similar the specified strings are to one another. A value of 1 means the specified strings are identical, and 0 means the strings are maximally different.
      Specified by:
      getDistance in interface StringDistance
      Parameters:
      target - The first string.
      other - The second string.
      Returns:
      a float between 0 and 1 based on how similar the specified strings are to one another.
    • equals

      public boolean equals(Object obj)
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object