PODS: a new model and processing algorithms for uncertain data streams

  • Authors:
  • Thanh T.L. Tran;Liping Peng;Boduo Li;Yanlei Diao;Anna Liu

  • Affiliations:
  • University of Massachusetts, Amherst, Amherst, MA, USA;University of Massachusetts, Amherst, Amherst, MA, USA;University of Massachusetts, Amherst, Amherst, MA, USA;University of Massachusetts, Amherst, Amherst, MA, USA;University of Massachusetts, Amherst, Amherst, MA, USA

  • Venue:
  • Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Uncertain data streams, where data is incomplete, imprecise, and even misleading, have been observed in many environments. Feeding such data streams to existing stream systems produces results of unknown quality, which is of paramount concern to monitoring applications. In this paper, we present the PODS system that supports stream processing for uncertain data naturally captured using continuous random variables. PODS employs a unique data model that is flexible and allows efficient computation. Built on this model, we develop evaluation techniques for complex relational operators, i.e., aggregates and joins, by exploring advanced statistical theory and approximation. Evaluation results show that our techniques can achieve high performance while satisfying accuracy requirements, and significantly outperform a state-of-the-art sampling method. A case study further shows that our techniques can enable a tornado detection system (for the first time) to produce detection results at stream speed and with much improved quality.