Pattern matching combined with regular expressions has many applications, including semistructured data matching and lexical analysis in compilers. Variables in patterns allow one to refer to parts of the matching input. But some regular patterns suffer from inherent ambiguity, yielding more than one valid result. A match policy like shortest or longest match can disambiguate such patterns.

In this paper, we show that regular pattern matching corresponds to sequential transduction. We derive straightforward ways to optimally compile regular patterns to sequential machines and to decide when regular patterns are unambiguous. Unambiguous patterns can be matched in a single traversal of the input. Ambiguities in patterns correspond to nondeterminism in sequential machines. Applying the match policy optimally yields two deterministic sequential machines, which produce the shortest match in two consecutive runs.
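To make the notion of ambiguity concrete, the following minimal sketch enumerates every valid binding for a hypothetical pattern `(x : a*)(y : a*)` and then applies a match policy to choose one. The pattern, the function names `all_matches` and `disambiguate`, and the dictionary representation of bindings are illustrative assumptions, not taken from the paper. The brute-force enumeration only illustrates the problem; it is not the sequential-machine construction described above.

```python
import re

def all_matches(text):
    """Enumerate every split text = x + y where x and y both match a*.

    Hypothetical pattern (x : a*)(y : a*); brute force, for illustration only.
    """
    results = []
    for i in range(len(text) + 1):
        x, y = text[:i], text[i:]
        if re.fullmatch(r"a*", x) and re.fullmatch(r"a*", y):
            results.append({"x": x, "y": y})
    return results

def disambiguate(matches, policy):
    """Pick one binding: 'shortest' minimizes len(x), anything else maximizes it."""
    by_x_length = lambda m: len(m["x"])
    if policy == "shortest":
        return min(matches, key=by_x_length)
    return max(matches, key=by_x_length)

matches = all_matches("aa")                    # three valid bindings: ambiguous
shortest = disambiguate(matches, "shortest")   # x = "",   y = "aa"
longest = disambiguate(matches, "longest")     # x = "aa", y = ""
```

On the input `aa` the pattern admits three bindings, so it is ambiguous; the policy reduces them to a single result. This enumeration is quadratic in the input length, whereas the abstract states that the compiled machines need at most two runs over the input.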