In this paper we propose a formal model for characterizing the latencies that affect the computation of a continuous query in either a Data Stream Management System (DSMS) or a Complex Event Processing (CEP) system. In the model, a query is built from basic Event Processing Units (EPUs) connected to one another. Each EPU is modeled with just a few parameters, which define its processing logic. The continuous query itself is modeled as a directed acyclic (data-flow) graph whose nodes are the EPUs and whose edges represent the flow of information (events) processed by the EPUs. The model associates with each data-flow graph a set of latency metrics, namely reactivity, activity, and output latencies, and a complexity measure, which we call data-flow graph complexity, representing the input dimension required to produce an output event. The proposed model can be used to compare different data-flow graphs with respect to their latency metrics. This is a crucial step in selecting the graph that best meets the latency requirements imposed by the programmer before the graph is actually submitted to a DSMS or a CEP system. Furthermore, the model can be considered an effective means of formally comparing data-flow graphs and predicting their behavior before an actual experimental validation phase.
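As a rough illustration of treating a query as a directed acyclic graph of EPUs, the following minimal sketch computes a critical-path completion time for each node. It is not the paper's model: the abstract does not define the reactivity, activity, or output latencies, so the single fixed per-event cost per EPU, the node names, and the function `critical_path_finish_times` are all assumptions made for illustration only.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path_finish_times(predecessors, cost):
    """Return, for each node, the earliest time its output can be ready.

    predecessors: dict mapping a node to the set of nodes feeding it.
    cost: dict mapping a node to an assumed fixed per-event processing time.
    A node starts only once all of its predecessors have finished.
    """
    finish = {}
    # static_order() yields every node after all of its predecessors.
    for node in TopologicalSorter(predecessors).static_order():
        start = max((finish[p] for p in predecessors.get(node, ())), default=0.0)
        finish[node] = start + cost[node]
    return finish


# Hypothetical data-flow graph: two filters feed a join, which feeds a sink.
predecessors = {
    "join": {"filter_a", "filter_b"},
    "sink": {"join"},
}
# Hypothetical per-event processing times, in arbitrary time units.
cost = {"filter_a": 1.0, "filter_b": 2.0, "join": 3.0, "sink": 0.5}

finish = critical_path_finish_times(predecessors, cost)
print(finish["sink"])  # 5.5, via the path filter_b -> join -> sink
```

Under these assumptions, two candidate graphs for the same query could be compared by evaluating the function on each and inspecting the value at the output node.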