Database-support for continuous prediction queries over streaming data

  • Authors:
  • Mert Akdere;Uğur Çetintemel;Eli Upfal

  • Affiliations:
  • Brown University, Providence, RI;Brown University, Providence, RI;Brown University, Providence, RI

  • Venue:
  • Proceedings of the VLDB Endowment
  • Year:
  • 2010

Abstract

Prediction is emerging as an essential ingredient for real-time monitoring, planning and decision support applications such as intrusion detection, e-commerce pricing and automated resource management. This paper presents a system that efficiently supports continuous prediction queries (CPQs) over streaming data using seamlessly integrated probabilistic models. Specifically, we describe how to execute and optimize CPQs using discrete (Dynamic) Bayesian Networks as the underlying predictive model. Our primary contribution is a novel cost-based optimization framework that employs materialization, sharing, and model-specific optimization techniques to enable highly efficient point- and range-based CPQ execution. Furthermore, we support efficient execution of top-k and threshold-based high-probability queries. We characterize the behavior of our system and demonstrate significant performance gains using a prototype implementation operating on real-world network intrusion data and deployed as part of a real-time software-performance monitoring system.