Measuring performance of complex event processing systems

Authors:
Torsten Grabs;Ming Lu
Affiliations:
Microsoft StreamInsight, Microsoft Corp., One Microsoft Way, Redmond, WA;Microsoft StreamInsight, Microsoft Corp., One Microsoft Way, Redmond, WA
Venue:
TPCTC'11 Proceedings of the Third TPC Technology conference on Topics in Performance Evaluation, Measurement and Characterization
Year:
2011

Abstract

Complex Event Processing (CEP), or stream data processing, is becoming increasingly popular as the platform underlying event-driven solutions and applications in industries such as financial services, oil and gas, smart grids, health care, and IT monitoring. Satisfactory performance is crucial for any solution across these industries. Typically, the performance of CEP engines is measured as (1) data rate, i.e., the number of input events processed per second, and (2) latency, which denotes the time it takes for the result (output events) to emerge from the system after the business event (input event) happened. While data rates are typically easy to measure by counting input events over time, latency is less well defined. As it turns out, a definition becomes particularly challenging in the presence of data arriving out of order, meaning that the order in which events arrive at the system differs from the order of their timestamps. Many important distributed scenarios need to deal with out-of-order arrival because communication delays easily introduce disorder. With out-of-order arrival, a CEP system cannot produce final answers as events arrive. Instead, time first needs to progress far enough in the overall system before correct results can be produced. This waiting introduces additional latency beyond the time it takes the system to process the events. We call the additional latency caused by waiting information latency, and the time the system spends processing the events system latency. This paper discusses both types of latency in detail and defines them formally without depending on the particular semantics of the CEP query plans. In addition, the paper suggests incorporating these definitions as metrics into the benchmarks that are used to assess and compare CEP systems.
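
The two metrics and the latency split described in the abstract can be illustrated with a minimal Python sketch. It is one possible reading, not the paper's formal definitions, which the abstract does not state. The names `Event`, `data_rate`, `is_out_of_order`, and `split_latency` are hypothetical, and the sketch assumes that event timestamps and arrival times are read from the same clock and that the system can finalize a result only once some input shows that time has progressed past the result's timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_time: float    # when the business event happened (its timestamp)
    arrival_time: float  # when the event reached the CEP system


def data_rate(events):
    """Input events per second over the observed arrival interval."""
    arrivals = [e.arrival_time for e in events]
    span = max(arrivals) - min(arrivals)
    return len(events) / span if span > 0 else float("inf")


def is_out_of_order(events):
    """True if arrival order differs from timestamp order."""
    by_arrival = sorted(events, key=lambda e: e.arrival_time)
    times = [e.event_time for e in by_arrival]
    return any(later < earlier for earlier, later in zip(times, times[1:]))


def split_latency(result_time, progress_arrival_time, output_time):
    """Split total latency into (information latency, system latency).

    Assumption: `progress_arrival_time` is when the system first received
    input proving that time had advanced to `result_time`, so the result
    could be finalized. Waiting until then is counted as information
    latency; the remaining time until output is counted as system latency.
    """
    information = progress_arrival_time - result_time
    system = output_time - progress_arrival_time
    return information, system
```

For example, with events whose (timestamp, arrival) pairs are (1.0, 1.5), (3.0, 3.25), (2.0, 3.5), and (5.0, 5.5), the third event arrives after an event with a later timestamp, so the stream is out of order. A result stamped 4.0 that can only be finalized when the last event arrives at 5.5, and that is emitted at 5.75, would under these assumptions carry 1.5 seconds of information latency and 0.25 seconds of system latency.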