Class LevenshteinAutomata


  • public class LevenshteinAutomata
    extends Object
    Class to construct DFAs that match a word within some edit distance.

    Implements the algorithm described in: Schulz and Mihov: Fast String Correction with Levenshtein Automata

    WARNING: This API is experimental and might change in incompatible ways in the next release.
    • Field Detail

      • MAXIMUM_SUPPORTED_DISTANCE

        public static final int MAXIMUM_SUPPORTED_DISTANCE
        Maximum edit distance this class can generate an automaton for.
        See Also:
        Constant Field Values
        NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
    • Constructor Detail

      • LevenshteinAutomata

        public LevenshteinAutomata​(String input,
                                   boolean withTranspositions)
        Create a new LevenshteinAutomata for some input String. Optionally count transpositions as a primitive edit.
      • LevenshteinAutomata

        public LevenshteinAutomata​(int[] word,
                                   int alphaMax,
                                   boolean withTranspositions)
        Expert: specify a custom maximum possible symbol (alphaMax); default is Character.MAX_CODE_POINT.
    • Method Detail

      • toAutomaton

        public Automaton toAutomaton​(int n)
        Compute a DFA that accepts all strings within an edit distance of n.

        All automata have the following properties:

        • They are deterministic (DFA).
        • There are no transitions to dead states.
        • They are not minimal (some transitions could be combined).
      • toAutomaton

        public Automaton toAutomaton​(int n,
                                     String prefix)
        Compute a DFA that accepts all strings within an edit distance of n, matching the specified exact prefix.

        All automata have the following properties:

        • They are deterministic (DFA).
        • There are no transitions to dead states.
        • They are not minimal (some transitions could be combined).