Uncertain data streams are increasingly common in real-world deployments, and monitoring applications require the evaluation of complex queries on such streams. In this paper, we consider complex queries that involve conditioning (e.g., selections and group-bys) and aggregation operations on uncertain data streams. To characterize the uncertainty of the answers to these queries, one generally has to compute the full probability distribution of each operation used in the query. Computing the distributions of aggregates given conditioned tuple distributions is a hard, unsolved problem. Our work employs a new evaluation framework that includes a general data model, approximation metrics, and approximate representations. Within this framework, we design fast data-stream algorithms, both deterministic and randomized, that return approximate distributions with bounded errors as answers to those complex queries. Our experimental results demonstrate the accuracy and efficiency of our approximation techniques and offer insights into the strengths and limitations of deterministic and randomized algorithms.
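To make the problem setting concrete, the following is a minimal sketch of a randomized approximation to the distribution of a conditioned aggregate. It is an illustration only and is not the algorithm proposed in the paper. It assumes each tuple carries an independent Gaussian attribute given as a (mean, standard deviation) pair, a selection predicate of the form x > threshold, and a SUM aggregate. The function names `sample_conditioned_sum` and `empirical_cdf` are hypothetical, and plain Monte Carlo sampling as used here comes with no bounded-error guarantee.

```python
import random
from bisect import bisect_right


def sample_conditioned_sum(tuples, threshold, n_samples=2000, seed=0):
    """Draw Monte Carlo samples of SUM(x) over tuples whose sampled x > threshold.

    tuples: list of (mean, std) pairs; each is assumed to describe an
    independent Gaussian attribute (an assumption made for this sketch).
    Returns the sampled sums in ascending order.
    """
    rng = random.Random(seed)
    sums = []
    for _ in range(n_samples):
        total = 0.0
        for mean, std in tuples:
            x = rng.gauss(mean, std)
            # Selection: the tuple contributes only when its sampled value
            # satisfies the predicate; otherwise it is dropped for this draw.
            if x > threshold:
                total += x
        sums.append(total)
    sums.sort()
    return sums


def empirical_cdf(sorted_samples, v):
    """Fraction of samples that are <= v."""
    return bisect_right(sorted_samples, v) / len(sorted_samples)


if __name__ == "__main__":
    readings = [(10.0, 1.0), (12.0, 2.0), (8.0, 1.5)]  # hypothetical sensor tuples
    sums = sample_conditioned_sum(readings, threshold=9.0)
    print("P(SUM <= 20) is approximately", empirical_cdf(sums, 20.0))
```

The sketch shows why the answer is a full distribution rather than a single number: because of the selection, each tuple contributes either its value or nothing, so the sum mixes many possible subsets of tuples.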