A unified construction of the Glushkov, follow, and Antimirov automata

  • Authors:
  • Cyril Allauzen; Mehryar Mohri

  • Affiliations:
  • Courant Institute of Mathematical Sciences, New York, NY; Courant Institute of Mathematical Sciences, New York, NY

  • Venue:
  • MFCS'06 Proceedings of the 31st international conference on Mathematical Foundations of Computer Science
  • Year:
  • 2006

Abstract

A number of different techniques have been introduced in the last few decades to create ε-free automata representing regular expressions, such as Glushkov automata, follow automata, and Antimirov automata. This paper presents a simple and unified view of all these construction methods for both unweighted and weighted regular expressions. It describes simpler algorithms with time complexities at least as favorable as those of the best previously known techniques, and provides a concise proof of their correctness. Our algorithms are all based on two standard automata operations: ε-removal and minimization. This contrasts with the multitude of complicated and special-purpose techniques previously described in the literature, and makes it straightforward to generalize these algorithms to the weighted case. In particular, we extend the definition and construction of follow automata to the case of weighted regular expressions over a closed semiring and present the first algorithm to compute weighted Antimirov automata.
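The abstract names ε-removal as one of the two standard operations the constructions rely on, without giving details. As a rough illustration only, the following Python sketch applies a generic, unweighted ε-removal to a small hand-built automaton with ε-transitions for the expression a*b. The data layout, the function names `epsilon_closure` and `remove_epsilons`, and the example automaton are all assumptions made for this sketch; it is not the paper's algorithm and does not cover the weighted case.

```python
from collections import defaultdict

EPS = None  # label used for epsilon transitions (an assumption of this sketch)


def epsilon_closure(state, transitions):
    """Return the set of states reachable from `state` via epsilon arcs only."""
    stack, seen = [state], {state}
    while stack:
        q = stack.pop()
        for label, dst in transitions.get(q, ()):
            if label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def remove_epsilons(initial, finals, transitions):
    """Build an equivalent epsilon-free NFA, keeping only accessible states."""
    new_trans = defaultdict(set)
    new_finals = set()
    todo, visited = [initial], {initial}
    while todo:
        p = todo.pop()
        closure = epsilon_closure(p, transitions)
        # p becomes final if a final state is reachable by epsilons alone.
        if closure & finals:
            new_finals.add(p)
        # Copy every labeled arc leaving the closure so that it leaves p.
        for q in closure:
            for label, dst in transitions.get(q, ()):
                if label is not EPS:
                    new_trans[p].add((label, dst))
                    if dst not in visited:
                        visited.add(dst)
                        todo.append(dst)
    return initial, new_finals, dict(new_trans)


# Hand-built automaton with epsilon arcs for a*b (hypothetical example).
thompson = {
    0: [(EPS, 1), (EPS, 3)],  # enter the star, or skip it
    1: [("a", 2)],
    2: [(EPS, 1), (EPS, 3)],  # loop back, or leave the star
    3: [("b", 4)],
    4: [],
}

initial, finals, trans = remove_epsilons(0, {4}, thompson)
print(initial, finals, trans)
```

In this small example the result keeps three states: the initial state plus one state per symbol occurrence in a*b, which is the shape usually associated with the Glushkov (position) automaton. Minimization, the second operation named in the abstract, is not shown.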