Query engine grid for executing SQL streaming process

Authors:
Qiming Chen;Meichun Hsu
Affiliations:
HP Labs, Palo Alto, California and Hewlett Packard Co.;HP Labs, Palo Alto, California and Hewlett Packard Co.
Venue:
Globe'11 Proceedings of the 4th international conference on Data management in grid and peer-to-peer systems
Year:
2011

Citing 10
Cited 0

The CQL continuous query language: semantic foundations and query execution

The VLDB Journal — The International Journal on Very Large Data Bases
Experiences with MapReduce, an abstraction for large-scale computation

Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Dryad: distributed data-parallel programs from sequential building blocks

Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Pig latin: a not-so-foreign language for data processing

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Exploiting the power of relational databases for efficient stream processing

Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
SFL: a structured dataflow language based on SQL and FP

DEXA'10 Proceedings of the 21st international conference on Database and expert systems applications: Part I
Experience in extending query engine for continuous analytics

DaWaK'10 Proceedings of the 12th international conference on Data warehousing and knowledge discovery
Distributed caching platforms

Proceedings of the VLDB Endowment
Continuous mapreduce for In-DB stream analytics

OTM'10 Proceedings of the 2010 international conference on On the move to meaningful internet systems
Experience in Continuous analytics as a Service (CaaaS)

Proceedings of the 14th International Conference on Extending Database Technology

Quantified Score

Hi-index	0.00

Visualization

Abstract

Many enterprise applications are based on continuous analytics of data streams. Integrating data-intensive stream processing with query processing allows us to take advantage of SQL's expressive power and DBMS's data management capability. However, it also raises serious challenges in dealing with complex dataflow, applying queries to unbounded stream data, and providing highly scalable, dynamically configurable, elastic infrastructure. In this project we tackle these problems in three dimensions. First, we model the general graph-structured, continuous dataflow analytics as a SQL Streaming Process with multiple connected and stationed continuous queries. Next, we extend the query engine to support cycle-based query execution for processing unbounded stream data in bounded chunks with sound semantics. Finally, we develop the Query Engine Grid (QE-Grid) over the Distributed Caching Platforms (DCP) as a dynamically configurable elastic infrastructure for parallel and distributed execution of SQL Streaming Processes. The proposed infrastructure is preliminarily implemented using PostgreSQL engines. Our experience shows its merit in leveraging SQL and query engines to analyze real-time, graph-structured and unbounded streams. Integrating it with a commercial and proprietary MPP based database cluster is being investigated.