TEEPA: a timely-aware elastic parallel architecture
Proceedings of the 16th International Database Engineering & Applications Symposium
Parallel shared-nothing architectures are currently used to handle large volumes of data arriving in real time for processing. The continuous increase in data volume and organization introduces several limitations to scalability and quality of service (QoS), due to processing requirements and joins. Parallelism may improve query performance; however, some businesses require timely results (results neither faster nor slower than specified), which cannot be guaranteed even with additional parallelism and significant upgrade costs (both monetary and due to disturbance of normal operations). We propose a timely-aware execution architecture, Cloudy, which balances data and query processing among an elastic set of non-dedicated, heterogeneous nodes in order to provide scale-out performance and timely results, neither faster nor slower than specified, using both Complex Event Processing (CEP) and a database (DB). Data is distributed across nodes according to their hardware characteristics; a set of layered mechanisms then rearranges queries in order to provide timely results. We present an experimental evaluation of Cloudy and demonstrate its ability to provide timely results.
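The abstract does not say how data is divided among heterogeneous nodes. As a purely illustrative sketch, one simple way to distribute data according to hardware characteristics is to give each node a share proportional to a capacity score. The `Node` class, the `capacity` score, and the `partition_by_capacity` function below are all hypothetical names introduced for this example; they are not taken from the paper and may differ from what Cloudy actually does.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    # Hypothetical relative score summarizing a node's hardware
    # (for example CPU and memory); not defined in the source.
    capacity: float


def partition_by_capacity(total_rows: int, nodes: list[Node]) -> dict[str, int]:
    """Split total_rows among nodes in proportion to their capacity.

    Uses largest-remainder rounding so the shares always sum to total_rows.
    """
    total_capacity = sum(node.capacity for node in nodes)
    if total_capacity <= 0:
        raise ValueError("at least one node must have positive capacity")

    # Exact (fractional) share for each node.
    exact = {n.name: total_rows * n.capacity / total_capacity for n in nodes}
    # Round every share down first.
    shares = {name: int(value) for name, value in exact.items()}

    # Hand the rows lost to rounding to the nodes with the largest fractions.
    leftover = total_rows - sum(shares.values())
    by_fraction = sorted(exact, key=lambda name: exact[name] - shares[name], reverse=True)
    for name in by_fraction[:leftover]:
        shares[name] += 1
    return shares
```

For instance, under these assumptions a node with capacity 3.0 would receive three times as many rows as a node with capacity 1.0. A real system would presumably also have to rebalance when nodes join or leave the elastic set, which this sketch does not attempt.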