Computing the edit distance of a regular language

  • Authors:
  • Stavros Konstantinidis

  • Affiliations:
  • Department of Mathematics and Computing Science, Saint Mary's University, Halifax, Nova Scotia, Canada B3H 3C3

  • Venue:
  • Information and Computation
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

The edit distance (or Levenshtein distance) between two words is the smallest number of substitutions, insertions, and deletions of symbols that can be used to transform one of the words into the other. In this paper, we consider the problem of computing the edit distance of a regular language (the set of words accepted by a given finite automaton). This quantity is the smallest edit distance between any pair of distinct words of the language. We show that the problem is of polynomial time complexity. In particular, for a given finite automaton A with n transitions, over an alphabet of r symbols, our algorithm operates in time O(n2r2q2( q+r)), where q is either the diameter of A (if A is deterministic), or the square of the number of states in A (if A is nondeterministic). Incidentally, we also obtain an upper bound on the edit distance of a regular language in terms of the automaton accepting the language.