The CQL continuous query language: semantic foundations and query execution
The VLDB Journal — The International Journal on Very Large Data Bases
Experiences with MapReduce, an abstraction for large-scale computation
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
SPADE: the system s declarative stream processing engine
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Clustera: an integrated computation and data management system
Proceedings of the VLDB Endowment
Data-Continuous SQL Process Model
OTM '08 Proceedings of the OTM 2008 Confederated International Conferences, CoopIS, DOA, GADA, IS, and ODBASE 2008. Part I on On the Move to Meaningful Internet Systems:
Exploiting the power of relational databases for efficient stream processing
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Extend UDF Technology for Integrated Analytics
DaWaK '09 Proceedings of the 11th International Conference on Data Warehousing and Knowledge Discovery
Efficiently support MapReduce-like computation models inside parallel DBMS
IDEAS '09 Proceedings of the 2009 International Database Engineering & Applications Symposium
Hi-index | 0.00 |
In the era of information explosion, huge amount of data are generated from various sensing devices continuously, which are often too low level for analytics purpose, and too massive to load to data-warehouses for filtering and summarizing with the reasonable latency. Distributed stream analytics for multilevel abstraction is the key to solve this problem. We advocate a distributed infrastructure for CDR (Call Detail Record) stream analytics in the telecommunication network where the stream processing is integrated into the database engine, and carried out in terms of continuous querying; the computation model is based on network-distributed (rather than clustered) Map-Reduce scheme. We propose the window based cooperation mechanism for having multiple engines synchronized and cooperating on the data falling in a common window boundary, based on time, cardinality, etc. This mechanism allows the engines to cooperate window by window without centralized coordination. We further propose the quantization mechanism for integrating the discretization and abstraction of continuous-valued data, for efficient and incremental data reduction, and in turn, network data movement reduction. These mechanisms provide the key roles in scaling out CDR stream analytics. The proposed approach has been integrated into the PostgreSQL engine. Our preliminary experiments reveal its merit for large-scale distributed stream processing.