Query engine grid for executing SQL streaming process

  • Authors:
  • Qiming Chen; Meichun Hsu

  • Affiliations:
  • HP Labs, Palo Alto, California and Hewlett Packard Co.; HP Labs, Palo Alto, California and Hewlett Packard Co.

  • Venue:
  • Globe '11: Proceedings of the 4th International Conference on Data Management in Grid and Peer-to-Peer Systems
  • Year:
  • 2011

Abstract

Many enterprise applications rely on continuous analytics over data streams. Integrating data-intensive stream processing with query processing lets us exploit the expressive power of SQL and the data management capabilities of a DBMS. However, it also raises serious challenges: handling complex dataflow, applying queries to unbounded stream data, and providing a highly scalable, dynamically configurable, elastic infrastructure. In this project we tackle these problems along three dimensions. First, we model general graph-structured, continuous dataflow analytics as a SQL Streaming Process composed of multiple connected, stationed continuous queries. Next, we extend the query engine to support cycle-based query execution, which processes unbounded stream data in bounded chunks with sound semantics. Finally, we develop the Query Engine Grid (QE-Grid) over the Distributed Caching Platform (DCP) as a dynamically configurable, elastic infrastructure for parallel and distributed execution of SQL Streaming Processes. A preliminary implementation of the proposed infrastructure uses PostgreSQL engines. Our experience shows its merit in leveraging SQL and query engines to analyze real-time, graph-structured, unbounded streams. Its integration with a commercial, proprietary MPP-based database cluster is under investigation.
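The idea of processing an unbounded stream in bounded chunks can be illustrated with a small sketch. This is not the authors' engine extension: it assumes a timestamp-ordered stream, a hypothetical (ts, key, value) event schema, and a fixed cycle length, and it uses an in-memory SQLite database in place of PostgreSQL. The names chunk_by_cycle and run_cycles are invented for this example.

```python
import sqlite3
from itertools import groupby
from typing import Iterable, Iterator, List, Tuple

# Hypothetical event schema: (timestamp in seconds, key, value).
Event = Tuple[int, str, float]


def chunk_by_cycle(stream: Iterable[Event],
                   cycle_sec: int) -> Iterator[Tuple[int, List[Event]]]:
    """Cut a timestamp-ordered stream into bounded chunks, one per cycle."""
    for cycle, events in groupby(stream, key=lambda e: e[0] // cycle_sec):
        yield cycle, list(events)


def run_cycles(stream: Iterable[Event], cycle_sec: int,
               query: str) -> Iterator[Tuple[int, list]]:
    """Apply the same SQL query to each chunk, one cycle at a time."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE chunk (ts INTEGER, key TEXT, value REAL)")
        for cycle, events in chunk_by_cycle(stream, cycle_sec):
            # Load only the current cycle's data, so the query sees a bounded input.
            conn.executemany("INSERT INTO chunk VALUES (?, ?, ?)", events)
            rows = conn.execute(query).fetchall()
            # Clear the chunk before the next cycle begins.
            conn.execute("DELETE FROM chunk")
            yield cycle, rows
    finally:
        conn.close()


if __name__ == "__main__":
    events = [(0, "a", 1.0), (30, "b", 2.0), (59, "a", 3.0),
              (60, "a", 4.0), (125, "b", 5.0)]
    per_key_sum = "SELECT key, SUM(value) FROM chunk GROUP BY key ORDER BY key"
    for cycle, rows in run_cycles(events, 60, per_key_sum):
        print(cycle, rows)
```

Because run_cycles is a generator, it consumes the input lazily and emits one result set per cycle, so it would also work on a source that never ends. Grouping by ts // cycle_sec is a simplification that relies on events arriving in timestamp order.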