Inference of regular languages using state merging algorithms with search

Authors:
Miguel Bugalho;Arlindo L. Oliveira
Affiliations:
IST/INESC-ID, R. Alves Redol 9, Lisboa 1000, Portugal;IST/INESC-ID, R. Alves Redol 9, Lisboa 1000, Portugal
Venue:
Pattern Recognition
Year:
2005

Abstract

State merging algorithms have emerged as the solution of choice for the problem of inferring regular grammars from labeled samples, a known NP-complete problem of great importance in the grammatical inference area. These methods derive a small deterministic finite automaton from a set of labeled strings (the training set) by merging parts of the acceptor that corresponds to this training set. Experimental and theoretical evidence has shown that the generalization ability of the resulting automata is highly correlated with the number of states in the final solution. As originally proposed, state merging algorithms do not perform search. This makes them fast, but it also means that they are limited by the quality of the heuristics they use to select the states to be merged. Sub-optimal choices lead to automata that have many more states than needed and that generalize poorly. In this work, we survey the existing approaches that generalize state merging algorithms by using search to explore the tree that represents the space of possible sequences of state mergings. With heuristic-guided search in this space, many possible state merging sequences can be considered, leading to smaller automata and improved generalization ability at the expense of increased computation time. We present comparisons of existing algorithms showing that, on widely accepted benchmarks, the quality of the derived solutions is improved by applying this type of search. However, we also point out that existing algorithms are not powerful enough to solve the more complex instances of the problem, which leaves open the possibility that better and more powerful approaches need to be designed.
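To make the idea concrete, the following is a minimal Python sketch of state merging with and without search. It is an illustration under stated assumptions, not any of the surveyed algorithms: the function names (`build_pta`, `merge`, `greedy`, `search`), the fixed pair ordering that stands in for a real merge heuristic, and the node `budget` are all hypothetical choices made for this example. The sketch builds the prefix tree acceptor for a small training set, merges states while folding their successors to keep the automaton deterministic, and compares a single greedy merge sequence against a bounded depth-first exploration of the tree of merge sequences.

```python
from itertools import combinations

def build_pta(pos, neg):
    """Prefix tree acceptor: one state per distinct prefix of the training set."""
    trans, label = {0: {}}, {0: None}          # label: True / False / None (unknown)
    for strings, accept in ((pos, True), (neg, False)):
        for s in strings:
            q = 0
            for a in s:
                if a not in trans[q]:
                    new = len(trans)
                    trans[q][a] = new
                    trans[new], label[new] = {}, None
                q = trans[q][a]
            label[q] = accept
    return trans, label

def merge(trans, label, q1, q2):
    """Merge two states and fold successors; return None on a label conflict."""
    trans = {q: dict(t) for q, t in trans.items()}   # work on copies
    label = dict(label)
    parent = {q: q for q in trans}
    def find(q):
        while parent[q] != q:
            q = parent[q]
        return q
    stack = [(q1, q2)]
    while stack:
        a, b = stack.pop()
        a, b = sorted((find(a), find(b)))      # smaller index survives, so root 0 stays
        if a == b:
            continue
        if None not in (label[a], label[b]) and label[a] != label[b]:
            return None                        # accepting merged with rejecting
        if label[a] is None:
            label[a] = label[b]
        parent[b] = a
        for sym, tgt in trans[b].items():
            if sym in trans[a]:
                stack.append((trans[a][sym], tgt))   # fold to stay deterministic
            else:
                trans[a][sym] = tgt
        del trans[b], label[b]
    for q in trans:                            # point transitions at surviving states
        for sym in trans[q]:
            trans[q][sym] = find(trans[q][sym])
    return trans, label

def greedy(trans, label):
    """No search: always commit to the first consistent merge (placeholder heuristic)."""
    while True:
        for q1, q2 in combinations(sorted(trans), 2):
            merged = merge(trans, label, q1, q2)
            if merged is not None:
                trans, label = merged
                break
        else:
            return trans, label

def search(trans, label, budget=2000):
    """Bounded depth-first search over merge sequences; keeps the smallest automaton."""
    best, nodes = [trans, label], [0]
    def dfs(t, l):
        nodes[0] += 1
        if len(t) < len(best[0]):
            best[0], best[1] = t, l
        for q1, q2 in combinations(sorted(t), 2):
            if nodes[0] >= budget:
                return
            merged = merge(t, l, q1, q2)
            if merged is not None:
                dfs(*merged)
    dfs(trans, label)
    return best[0], best[1]

def accepts(trans, label, s):
    q = 0
    for a in s:
        if a not in trans[q]:
            return False
        q = trans[q][a]
    return label[q] is True

# Hypothetical training set, labeled by "even number of a's".
pos = ["", "aa", "b", "aab", "baa"]
neg = ["a", "ab", "ba", "aaa"]
pta = build_pta(pos, neg)
g = greedy(*pta)
s = search(*pta)
print(len(pta[0]), len(g[0]), len(s[0]))       # state counts: PTA, greedy, search
```

Because the first branch explored by `search` is exactly the greedy sequence, the searched result can never have more states than the greedy one; the extra branches are where a smaller automaton may be found, at the cost of more merge attempts.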