The ability to flexibly compose confidence computation with the operations of relational algebra is an important feature of probabilistic database query languages. Computing confidences is computationally hard, however, and has to be approximated in practice. In a compositional query language, even very small approximation errors can lead to an entirely incorrect result: a selection operation on an approximated probability can incorrectly keep or drop a tuple even if the probability value has been approximated to within a very narrow confidence interval. In this paper, we study the query evaluation problem for compositional query languages for probabilistic databases, with particular focus on providing overall result quality guarantees in the face of approximate intermediate results. We present a framework for evaluating compositional queries based on a new representation system that can capture uncertainty about probabilities. More specifically, we consider probability intervals instead of exact probabilities, interpreting tuples obtained by selection on approximate values as unreliable. We study the complexity of query evaluation over our new model. We present efficient confidence computation algorithms that compute bounds that are close to tight for important classes. For deciding a selection predicate, we show that no efficient randomized algorithm exists unless BPP⊃NP. Still, we are able to efficiently guess robust predicates with a good error bound. Putting all these pieces together in our framework, we evaluate queries using a decomposition into a relational algebra plan and an approximation plan. The latter allows accuracy and error bounds to be improved successively, while the relational algebra plan only has to be executed once.
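
To make the selection problem concrete, the following is a minimal sketch of one possible reading of selection over probability intervals. It is not the paper's representation system or algorithm. The names `IntervalTuple` and `select_at_least`, the threshold, and the example intervals are all hypothetical and chosen only for illustration. A tuple is kept as reliable when its whole interval satisfies the predicate, dropped when its whole interval fails it, and flagged as unreliable when the interval straddles the threshold.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class IntervalTuple:
    """Hypothetical tuple whose confidence is only known to lie in [lo, hi]."""
    key: str
    lo: float
    hi: float


def select_at_least(
    tuples: List[IntervalTuple], threshold: float
) -> Tuple[List[IntervalTuple], List[IntervalTuple]]:
    """Evaluate the predicate 'confidence >= threshold' on interval-valued tuples.

    Returns (reliable, unreliable):
      reliable   - the whole interval is at or above the threshold
      unreliable - the interval straddles the threshold, so the decision
                   could change once the interval is narrowed
    Tuples whose whole interval lies below the threshold are dropped.
    """
    reliable: List[IntervalTuple] = []
    unreliable: List[IntervalTuple] = []
    for t in tuples:
        if t.lo >= threshold:
            reliable.append(t)
        elif t.hi >= threshold:
            unreliable.append(t)
        # otherwise t.hi < threshold: the tuple certainly fails the predicate
    return reliable, unreliable


# Illustrative data only: "b" has a narrow interval but still straddles 0.5.
rows = [
    IntervalTuple("a", 0.62, 0.66),
    IntervalTuple("b", 0.49, 0.51),
    IntervalTuple("c", 0.10, 0.20),
]
reliable, unreliable = select_at_least(rows, 0.5)
```

In this sketch, tuple `b` shows the issue described above: its interval is narrow, yet a point estimate anywhere inside it could fall on either side of the threshold, so only a further refinement of the interval can settle the decision.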