Correlating complex events over live and archived data streams, which we call Pattern Correlation Queries (PCQs), provides many benefits for domains that need real-time forecasting of events or identification of causal dependencies while handling data at high rates and in massive amounts, such as financial or medical settings. Existing work has focused either on complex event processing over a single type of stream source (i.e., either live or archived) or on simple stream correlation queries (e.g., live events triggering a database lookup). In this paper, we focus specifically on recency-based PCQs and provide clear, useful, and optimizable semantics for them. PCQs raise a number of challenges in optimizing data management and query processing, which we address in the setting of the DejaVu complex event processing system. More specifically, we propose three complementary optimizations: recent input buffering, query result caching, and join source ordering. Furthermore, we capture the relevant query processing tradeoffs in a cost model. An extensive performance study on synthetic and real-life data sets not only validates this cost model but also shows that our optimizations are very effective, achieving more than two orders of magnitude throughput improvement and much better scalability compared to a conventional approach.