Experience in Continuous analytics as a Service (CaaaS)

Authors:
Qiming Chen;Meichun Hsu;Hans Zeller
Affiliations:
HP Labs, Palo Alto, CA;HP Labs, Palo Alto, CA;HP SW NED, Cupertino, CA
Venue:
Proceedings of the 14th International Conference on Extending Database Technology
Year:
2011

Citing 12
Cited 9

TelegraphCQ: continuous dataflow processing

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Aurora: a new model and architecture for data stream management

The VLDB Journal — The International Journal on Very Large Data Bases
Design, implementation, and evaluation of the linear road bnchmark on the stream processing core

Proceedings of the 2006 ACM SIGMOD international conference on Management of data
The CQL continuous query language: semantic foundations and query execution

The VLDB Journal — The International Journal on Very Large Data Bases
Experiences with MapReduce, an abstraction for large-scale computation

Proceedings of the 15th international conference on Parallel architectures and compilation techniques
SPADE: the system s declarative stream processing engine

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Clustera: an integrated computation and data management system

Proceedings of the VLDB Endowment
Exploiting the power of relational databases for efficient stream processing

Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Extend UDF Technology for Integrated Analytics

DaWaK '09 Proceedings of the 11th International Conference on Data Warehousing and Knowledge Discovery
Efficiently support MapReduce-like computation models inside parallel DBMS

IDEAS '09 Proceedings of the 2009 International Database Engineering & Applications Symposium
HadoopDB: an architectural hybrid of MapReduce and DBMS technologies for analytical workloads

Proceedings of the VLDB Endowment
Continuous mapreduce for In-DB stream analytics

OTM'10 Proceedings of the 2010 international conference on On the move to meaningful internet systems

The fix-point method for discrete events simulation using SQL and UDF

DEXA'11 Proceedings of the 22nd international conference on Database and expert systems applications - Volume Part II
Query engine grid for executing SQL streaming process

Globe'11 Proceedings of the 4th international conference on Data management in grid and peer-to-peer systems
SQL streaming process in query engine net

OTM'11 Proceedings of the 2011th Confederated international conference on On the move to meaningful internet systems - Volume Part I
Continuous access to cloud event services with event pipe queries

OTM'11 Proceedings of the 2011th Confederated international conference on On the move to meaningful internet systems - Volume Part II
Extend core UDF framework for GPU-enabled analytical query evaluation

Proceedings of the 15th Symposium on International Database Engineering & Applications
Massively-parallel stream processing under QoS constraints with Nephele

Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing
Stream-join revisited in the context of epoch-based SQL continuous query

Proceedings of the 16th International Database Engineering & Applications Sysmposium
A new paradigm for collaborating distributed query engines

DaWaK'12 Proceedings of the 14th international conference on Data Warehousing and Knowledge Discovery
Leveraging the capabilities of service-oriented decision support systems: Putting analytics and big data in cloud

Decision Support Systems

Quantified Score

Hi-index	0.00

Visualization

Abstract

Mobile applications, such as those on WebOS, increasingly depend on continuous analytics results of real-time events, for monitoring oil & gas production, watching traffic status and detecting accident, etc, which has given rise to the need of providing Continuous analytics as a Service (CaaaS). While representing a paradigm shift in cloud computing, CaaaS poses several challenges in scalability, latency, time-window semantics, transaction control and result-set staging. A data stream is infinite thus can only be analyzed in granules. We propose a continuous query model over both static relations and dynamic streaming data, which allows a long-standing SQL query instance to run cycle by cycle, each cycle for a chunk of data from the data stream, using a cut-and-rewind mechanism. We further support the cycle-based transaction model with cycle-based isolation and visibility, for delivering analytics results to the clients continuously while the query is running. To have the continuously generated analytics results staged efficiently, we developed the table-ring and label switching mechanism characterized by staging data through metadata manipulation without physical data moving and copying. To scale-out analytics computation, we support both parallel database based and network distributed Map-Reduce based infrastructure with multiple cooperating engines. We have built the proposed infrastructure by extending the PostgreSQL engine. We tested the throughput and latency of this service based on a well-known stream processing benchmark; the results show that the proposed approach is highly competitive. Our experiments indicate that the database technology can be extended and applied to real-time continuous analytics service provisioning.