Simplifying regular expressions: a quantitative perspective

Authors:
Hermann Gruber;Stefan Gulan
Affiliations:
Institut für Informatik, Universität Gießen, Gießen, Germany;Fachbereich IV—Informatik, Universität Trier, Trier, Germany
Venue:
LATA'10 Proceedings of the 4th international conference on Language and Automata Theory and Applications
Year:
2010

Citing 15

Regular expressions into finite automata. Theoretical Computer Science
Term rewriting and all that
On a question of A. Salomaa: the equational theory of regular expression. Theoretical Computer Science
Programming Techniques: Regular expression search algorithm. Communications of the ACM
Theory of Computation: A Primer
Follow automata. Information and Computation
Regular expressions: new results and open problems. Journal of Automata, Languages and Combinatorics
Finite Automata, Digraph Connectivity, and Regular Expression Size. ICALP '08 Proceedings of the 35th international colloquium on Automata, Languages and Programming, Part II
The equivalence problem for regular expressions with squaring requires exponential space. SWAT '72 Proceedings of the 13th Annual Symposium on Switching and Automata Theory (swat 1972)
Multi-tilde Operators and Their Glushkov Automata. LATA '09 Proceedings of the 3rd International Conference on Language and Automata Theory and Applications
Faster Regular Expression Matching. ICALP '09 Proceedings of the 36th International Colloquium on Automata, Languages and Programming: Part I
Optimizing Schema Languages for XML: Numerical Constraints and Interleaving. SIAM Journal on Computing
The effect of rewriting regular expressions on their accepting automata. CIAA'03 Proceedings of the 8th international conference on Implementation and application of automata
Optimal lower bounds on regular expression size using communication complexity. FOSSACS'08/ETAPS'08 Proceedings of the Theory and practice of software, 11th international conference on Foundations of software science and computational structures
Enumerating regular expressions and their languages. CIAA'04 Proceedings of the 9th international conference on Implementation and Application of Automata

Cited 1

The complexity of regular(-like) expressions. DLT'10 Proceedings of the 14th international conference on Developments in language theory

Abstract

We consider the efficient simplification of regular expressions and suggest a quantitative comparison of heuristics for simplifying regular expressions. To this end, we propose a new normal form for regular expressions, which outperforms previous heuristics while still being computable in linear time. This allows us to determine an exact bound for the relation between the two prevalent measures of regular expression size: alphabetic width and reverse Polish notation length. In addition, we show that every regular expression of alphabetic width $n$ can be converted into a nondeterministic finite automaton with ε-transitions of size at most $4\frac{2}{5}n+1$, and prove this bound to be optimal. This answers a question posed by Ilie and Yu, who had obtained lower and upper bounds of $4n-1$ and $9n-\frac{1}{2}$, respectively [15]. For reverse Polish notation length as the input size measure, an optimal bound was recently determined by Gulan and Fernau [14]. We prove that, under mild restrictions, their construction is also optimal when alphabetic width is taken as the input size measure.
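
To make the two size measures named in the abstract concrete, here is a minimal sketch in Python. It assumes the usual textbook definitions: alphabetic width counts the occurrences of alphabet symbols, and reverse Polish notation length counts all nodes of the syntax tree (symbols and operators alike). The class names, function names, and the helper `enfa_size_upper_bound` are hypothetical and chosen for illustration only; the helper merely evaluates the bound $4\frac{2}{5}n+1$ stated in the abstract and does not implement the paper's construction or its normal form.

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Union


# Hypothetical syntax-tree node types for regular expressions.
@dataclass(frozen=True)
class Sym:
    name: str  # one occurrence of an alphabet symbol


@dataclass(frozen=True)
class Eps:
    pass  # the empty word


@dataclass(frozen=True)
class Star:
    inner: "Regex"


@dataclass(frozen=True)
class Cat:
    left: "Regex"
    right: "Regex"


@dataclass(frozen=True)
class Alt:
    left: "Regex"
    right: "Regex"


Regex = Union[Sym, Eps, Star, Cat, Alt]


def alphabetic_width(r: Regex) -> int:
    """Number of occurrences of alphabet symbols (assumed definition)."""
    if isinstance(r, Sym):
        return 1
    if isinstance(r, Eps):
        return 0
    if isinstance(r, Star):
        return alphabetic_width(r.inner)
    return alphabetic_width(r.left) + alphabetic_width(r.right)


def rpn_length(r: Regex) -> int:
    """Number of nodes in the syntax tree (assumed definition)."""
    if isinstance(r, (Sym, Eps)):
        return 1
    if isinstance(r, Star):
        return 1 + rpn_length(r.inner)
    return 1 + rpn_length(r.left) + rpn_length(r.right)


def enfa_size_upper_bound(n: int) -> Fraction:
    """Evaluate the bound 4 2/5 * n + 1 = (22/5) * n + 1 from the abstract."""
    return Fraction(22, 5) * n + 1


# (a + b)* . a has three symbol occurrences and six syntax-tree nodes.
expr = Cat(Star(Alt(Sym("a"), Sym("b"))), Sym("a"))
print(alphabetic_width(expr))    # 3
print(rpn_length(expr))          # 6
print(enfa_size_upper_bound(5))  # 23
```

The example shows why the two measures can differ: operators and ε contribute to reverse Polish notation length but not to alphabetic width.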