Query languages and data models for database sequences and data streams

  • Authors:
  • Yan-Nei Law;Haixun Wang;Carlo Zaniolo

  • Affiliations:
  • Computer Science Dept., UCLA, Los Angeles, CA;IBM T. J. Watson Research, Hawthorne, NY;Computer Science Dept., UCLA, Los Angeles, CA

  • Venue:
  • VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

We study the fundamental limitations of relational algebra (RA) and SQL in supporting sequence and stream queries, and present effective query language and data model enrichments to deal with them. We begin by observing the well-known limitations of SQL in application domains which are important for data streams, such as sequence queries and data mining. Then we present a formal proof that, for continuous queries on data streams, SQL suffers from additional expressive power problems. We begin by focusing on the notion of nonblocking (NB) queries that are the only continuous queries that can be supported on data streams. We characterize the notion of nonblocking queries by showing that they are equivalent to monotonic queries. Therefore the notion of NB-completeness for RA can be formalized as its ability to express all monotonic queries expressible in RA using only the monotonic operators of RA. We show that RA is not NB-complete, and SQL is not more powerful than RA for monotonic queries. To solve these problems, we propose extensions that allow SQL to support all the monotonic queries expressible by a Turing machine using only monotonic operators. We show that these extensions are (i) user-defined aggregates (UDAs) natively coded in SQL (rather than in an external language), and (ii) a generalization of the union operator to support the merging of multiple streams according to their timestamps. These query language extensions require matching extensions to basic relational data model to support sequences explicitly ordered by times-tamps. Along with the formulation of very powerful queries, the proposed extensions entail more efficient expressions for many simple queries. In particular, we show that nonblocking queries are simple to characterize according to their syntactic structure.