String operations in query languages

  • Authors:
  • Michael Benedikt;Leonid Libkin;Thomas Schwentick;Luc Segoufin

  • Affiliations:
  • Bell Laboratories, 263 Shuman Blvd, Naperville, IL;Department of Computer Science, University of Toronto, 6, King's College Road, Toronto, Ontario M5S 3H5, Canada;Institut für Informatik, Friedrich-Schiller-Universität Jena, Ernst-Abbe-Platz 3, 07740 Jena, Germany;INRIA-Rocquencourt, B.P. 105, 78153 Le Chesnay Cedex, France

  • Venue:
  • PODS '01 Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
  • Year:
  • 2001

Abstract

We study relational calculi with support for string operations. While SQL restricts the ability to mix string pattern-matching and relational operations, prior proposals for embedding SQL in a compositional calculus were based on adding the operation of concatenation to first-order logic. These latter proposals yield compositional query languages extending SQL, but are unfortunately computationally complete. The unbounded expressive power in turn implies strong limits on the ability to perform optimization and static analysis of properties such as query safety in these languages.

In contrast, we look at compositional extensions of relational calculus that have nice expressiveness, decidability, and safety properties, while capturing string-matching queries used in SQL. We start with an extension based on the string ordering and LIKE predicates. This extension shares some of the attractive properties of relational calculus (e.g., effective syntax for safe queries, low data complexity), but lacks the full power of regular-expression pattern-matching. When we extend this basic model to include string length comparison, we get a natural string language with great expressiveness, but one which includes queries with high (albeit bounded) data complexity. We thus explore the space between these two languages. We consider two intermediate languages: the first extends our base language with functions that trim/add leading characters, and the other extends it by adding the full power of regular-expression pattern-matching. We show that both these extensions inherit many of the attractive properties of the basic model: they both have corresponding algebras expressing safe queries, and low complexity of query evaluation.
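
To make the string primitives named in the abstract concrete, the following Python sketch gives one possible reading of each. It is an illustration under stated assumptions, not the paper's formal definitions: the abstract does not say which string ordering is intended (lexicographic order is assumed here, with the prefix relation shown alongside), LIKE is assumed to follow the usual SQL wildcards `%` and `_`, and the behaviour of trimming when the leading character is absent is a guess. All function names are hypothetical.

```python
import re


def like(s: str, pattern: str) -> bool:
    """SQL-style LIKE (assumed semantics): '%' matches any string,
    '_' matches exactly one character, everything else is literal."""
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, s, flags=re.DOTALL) is not None


def lex_leq(s: str, t: str) -> bool:
    """String ordering, assumed here to be lexicographic order."""
    return s <= t


def is_prefix(s: str, t: str) -> bool:
    """Prefix relation: s is an initial segment of t."""
    return t.startswith(s)


def same_length(s: str, t: str) -> bool:
    """String length comparison, in its simplest form (equal length)."""
    return len(s) == len(t)


def add_leading(a: str, s: str) -> str:
    """Add the single character a at the front of s."""
    return a + s


def trim_leading(a: str, s: str) -> str:
    """Remove a leading character a from s; returning s unchanged when
    s does not start with a is an assumption of this sketch."""
    return s[1:] if s.startswith(a) else s
```

For instance, `like("database", "data%")` holds while `like("data", "d_t")` does not. The languages discussed in the abstract differ in which of these primitives a first-order query may use.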