Learning Deterministic Finite Automata from Smallest Counterexamples

  • Authors:
  • Andreas Birkendorf; Andreas Böker; Hans Ulrich Simon

  • Venue:
  • SIAM Journal on Discrete Mathematics
  • Year:
  • 2000

Abstract

We show in this paper (a preliminary version of which appeared as an extended abstract in [Proceedings of the 9th International ACM--SIAM Symposium on Discrete Algorithms, ACM, 1998]) that deterministic finite automata (DFAs) with $n$ states and input alphabet $\Sigma$ can be learned efficiently from fewer than $|\Sigma|n^2$ smallest counterexamples. This improves on an earlier result of Ibarra and Jiang, who required $|\Sigma|n^3$ smallest counterexamples. We present a general strategy that learns a finite concept class ${\cal F}$ from $\lfloor\log|{\cal F}|\rfloor$ smallest counterexamples (but not necessarily efficiently). An application to DFAs with at most $n$ states shows that $(1+o(1))|\Sigma|n\log n$ smallest counterexamples are sufficient (if efficiency is not an issue). We show next that the special DFAs operating on input words of an arbitrary but fixed length (the so-called leveled DFAs) are efficiently learnable from $(1+o(1))|\Sigma|n\log n$ smallest counterexamples. This improves on an earlier result of Ibarra and Jiang, who required $|\Sigma|n^2$ smallest counterexamples. Furthermore, we present a general lower bound on the number of smallest counterexamples required by any learning algorithm. This bound can be stated in terms of a new combinatorial dimension associated with the target class. Computing this dimension for leveled or arbitrary DFAs leads to a lower bound of the form $(\frac{1}{4}+o(1))|\Sigma|n\log n$, which matches the aforementioned upper bounds up to a constant factor of approximately 4. Finally, we present a general conversion of algorithms learning from smallest counterexamples into algorithms performing self-directed learning. For the particular classes of leveled or arbitrary DFAs, this conversion leads to self-directed learners making the smallest possible number of mistakes (up to a constant factor of approximately 4). A similar remark is valid for the class of multiplicity automata (MAs).
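
To make the $\lfloor\log|{\cal F}|\rfloor$ bound for finite concept classes concrete, the following is a minimal sketch of the classic halving idea (majority-vote hypotheses), which may differ from the strategy actually used in the paper. It assumes that the logarithm is base 2, that the domain is a small, totally ordered set of points indexed `0..m-1`, and that each concept is a tuple of booleans over those points. The function names `majority_hypothesis`, `smallest_counterexample`, and `learn_from_smallest_counterexamples` are hypothetical and chosen for illustration only.

```python
from itertools import product
from math import floor, log2


def majority_hypothesis(version_space, m):
    """Label each point by a strict majority vote of the remaining concepts."""
    return tuple(
        2 * sum(c[i] for c in version_space) > len(version_space)
        for i in range(m)
    )


def smallest_counterexample(hypothesis, target):
    """Teacher (assumed): index of the first point where hypothesis and target differ."""
    for i, (h, t) in enumerate(zip(hypothesis, target)):
        if h != t:
            return i
    return None  # hypothesis equals target


def learn_from_smallest_counterexamples(concepts, target):
    """Halving-style learner; `target` must be a member of `concepts`.

    Returns the final hypothesis and the number of counterexamples received.
    """
    m = len(target)
    version_space = list(concepts)
    received = 0
    while True:
        hyp = majority_hypothesis(version_space, m)
        i = smallest_counterexample(hyp, target)
        if i is None:
            return hyp, received
        received += 1
        # The target disagrees with hyp at point i, and, because the
        # counterexample is the smallest one, agrees with hyp on all
        # points before i. The first condition alone removes at least
        # half of the version space.
        version_space = [
            c for c in version_space
            if c[i] != hyp[i] and c[:i] == hyp[:i]
        ]


if __name__ == "__main__":
    # Toy class: all 16 Boolean labelings of 4 ordered points.
    concepts = list(product([False, True], repeat=4))
    bound = floor(log2(len(concepts)))
    worst = max(
        learn_from_smallest_counterexamples(concepts, t)[1] for t in concepts
    )
    print("bound:", bound, "worst case observed:", worst)
```

Each counterexample contradicts the majority vote, so at most half of the remaining concepts survive it, while the target always survives. After $k$ counterexamples the version space therefore has between 1 and $|{\cal F}|/2^k$ members, which gives $k \le \lfloor\log_2|{\cal F}|\rfloor$. The sketch enumerates the whole class in every round, which is consistent with the abstract's caveat that the general strategy is not necessarily efficient.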