We present a new technique to encode a deterministic finite automaton (DFA). Based on the specific properties of Glushkov's nondeterministic finite automaton (NFA) construction algorithm, we are able to encode the DFA using $(m+1)(2^{m+1} + |\Sigma|)$ bits, where $m$ is the number of characters (excluding operator symbols) in the regular expression and $\Sigma$ is the alphabet. This compares favorably against the worst case of $(m+1)\,2^{m+1}\,|\Sigma|$ bits needed by a classical DFA representation and $m(2^{2m+1} + |\Sigma|)$ bits needed by the Wu and Manber approach implemented in Agrep. Our approach is practical and simple to implement, and it permits searching regular expressions of moderate size (which include most cases of interest) faster than with any previously existing algorithm, as we show experimentally.
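To get a feel for the gap between the three space bounds quoted above, the following minimal sketch evaluates them for a given pattern length and alphabet size. The function name `dfa_space_bits` and the dictionary keys are hypothetical labels chosen for illustration; only the three formulas themselves come from the abstract.

```python
def dfa_space_bits(m, sigma):
    """Evaluate the three space bounds from the abstract, in bits.

    m     -- number of characters in the regular expression
             (operator symbols excluded)
    sigma -- alphabet size |Sigma|

    The function and key names are illustrative assumptions; the
    formulas are the ones stated in the abstract.
    """
    if m < 1 or sigma < 1:
        raise ValueError("m and sigma must be positive integers")

    subsets = 2 ** (m + 1)  # the 2^(m+1) factor shared by the first two bounds
    return {
        # (m+1)(2^(m+1) + |Sigma|): the compact encoding
        "compact": (m + 1) * (subsets + sigma),
        # (m+1) 2^(m+1) |Sigma|: worst case for a classical DFA table
        "classical": (m + 1) * subsets * sigma,
        # m(2^(2m+1) + |Sigma|): the Wu and Manber approach
        "wu_manber": m * (2 ** (2 * m + 1) + sigma),
    }


if __name__ == "__main__":
    for name, bits in dfa_space_bits(8, 256).items():
        print(f"{name:>10}: {bits:>10,} bits")
```

For example, with m = 8 and a 256-symbol alphabet, the compact bound is 6,912 bits, against 1,179,648 bits for the classical worst case and 1,050,624 bits for the Wu and Manber bound.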