On the intersection of regex languages with regular languages

  • Authors:
  • Cezar Câmpeanu;Nicolae Santean

  • Affiliations:
  • Department of Computer Science and Information Technology, University of Prince Edward Island, Canada;Department of Computer and Information Sciences, Indiana University South Bend, IN, USA

  • Venue:
  • Theoretical Computer Science
  • Year:
  • 2009

Quantified Score

Hi-index 5.23

Visualization

Abstract

In this paper we revisit the semantics of extended regular expressions (regex), defined succinctly in the 90s [A.V. Aho, Algorithms for finding patterns in strings, in: Jan van Leeuwen (Ed.), Handbook of Theoretical Computer Science, in: Algorithms and Complexity, vol. A, Elsevier and MIT Press, 1990, pp. 255-300] and rigorously in 2003 by Campeanu, Salomaa and Yu [C. Campeanu, K. Salomaa, S. Yu, A formal study of practical regular expressions, IJFCS 14 (6) (2003) 1007-1018], when the authors reported an open problem, namely whether regex languages are closed under the intersection with regular languages. We give a positive answer; and for doing so, we propose a new class of machines - regex automata systems (RAS) - which are equivalent to regex. Among others, these machines provide a consistent and convenient method of implementing regex in practice. We also prove, as a consequence of this closure property, that several languages, such as the mirror language, the language of palindromes, and the language of balanced words are not regex languages.