Minimisation of acyclic deterministic automata in linear time
Theoretical Computer Science - Selected papers of the Combinatorial Pattern Matching School
Text algorithms
Efficient string matching: an aid to bibliographic search
Communications of the ACM
Introduction To Automata Theory, Languages, And Computation
Applied Combinatorics on Words (Encyclopedia of Mathematics and its Applications)
Algorithms on Strings
Building the minimal automaton of A*X in linear time, when X is of bounded cardinality
CPM'10 Proceedings of the 21st annual conference on Combinatorial pattern matching
A classical construction of Aho and Corasick solves the pattern matching problem for a finite set of words X in linear time, where the size of the input X is the sum of the lengths of its elements. It produces an automaton that recognizes A*X, where A is a finite alphabet, but which is generally not minimal. As an alternative to classical minimization algorithms, which yield an ${\mathcal O}(n\log n)$ solution to the problem, we propose a linear-time pseudo-minimization algorithm specific to Aho-Corasick automata, which produces an automaton whose size lies between that of the input automaton and that of its associated minimal automaton. Moreover, this algorithm generically computes the minimal automaton: for a large variety of natural distributions, the probability that the output is the minimal automaton of A*X tends to one as the size of X tends to infinity.
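To make the gap between the Aho-Corasick automaton and the minimal automaton concrete, here is a minimal Python sketch. It is not the pseudo-minimization algorithm of the paper, whose details the abstract does not give. It builds the complete Aho-Corasick automaton for A*X in the textbook way (trie plus failure links), then counts the states of the minimal automaton with Moore's partition refinement, a simple classical method that is quadratic in the worst case. The function names `aho_corasick_dfa` and `minimal_size` are hypothetical, and the sketch assumes every letter of every word in X belongs to the given alphabet.

```python
from collections import deque


def aho_corasick_dfa(words, alphabet):
    """Build the complete Aho-Corasick DFA recognizing A*X.

    Returns (delta, final): delta[s][c] is the successor of state s on
    letter c, final is the set of accepting states, state 0 is initial.
    Assumption: every letter of every word belongs to `alphabet`.
    """
    # Step 1: build the trie of X; each trie node is a state.
    goto = [{}]
    final = set()
    for w in words:
        s = 0
        for c in w:
            if c not in goto[s]:
                goto.append({})
                goto[s][c] = len(goto) - 1
            s = goto[s][c]
        final.add(s)

    # Step 2: breadth-first pass that fills in missing transitions by
    # following failure links (longest proper suffix that is a trie node).
    fail = [0] * len(goto)
    delta = [dict() for _ in goto]
    queue = deque()
    for c in alphabet:
        if c in goto[0]:
            delta[0][c] = goto[0][c]
            queue.append(goto[0][c])
        else:
            delta[0][c] = 0
    while queue:
        s = queue.popleft()
        # A state is accepting if some suffix of its label is in X.
        if fail[s] in final:
            final.add(s)
        for c in alphabet:
            if c in goto[s]:
                t = goto[s][c]
                fail[t] = delta[fail[s]][c]
                delta[s][c] = t
                queue.append(t)
            else:
                delta[s][c] = delta[fail[s]][c]
    return delta, final


def minimal_size(delta, final, alphabet):
    """Count the states of the minimal DFA by Moore partition refinement.

    All states of an Aho-Corasick automaton are reachable, so no
    reachability pruning is needed here.
    """
    n = len(delta)
    cls = [1 if s in final else 0 for s in range(n)]
    while True:
        signatures = {}
        new_cls = []
        for s in range(n):
            # Two states stay together only if they are in the same class
            # and their successors are in the same classes, letter by letter.
            sig = (cls[s],) + tuple(cls[delta[s][c]] for c in alphabet)
            new_cls.append(signatures.setdefault(sig, len(signatures)))
        if len(signatures) == len(set(cls)):
            return len(signatures)
        cls = new_cls


if __name__ == "__main__":
    delta, final = aho_corasick_dfa(["aa", "ba"], "ab")
    print(len(delta), minimal_size(delta, final, "ab"))  # prints "5 3"
```

For X = {aa, ba} over A = {a, b}, the Aho-Corasick automaton has 5 states (one per prefix of a word of X), while the minimal automaton of A*X has only 3, since A*X is simply the set of words of length at least two that end in a.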