A paramount challenge in probabilistic databases is the scalable computation of confidences of tuples in query results. This paper introduces an efficient secondary-storage operator for exact confidence computation for queries on tuple-independent probabilistic databases. We consider two classes of queries: conjunctive queries without self-joins that are known to be tractable on any tuple-independent database, and queries that are not tractable in general but become tractable on probabilistic databases restricted by functional dependencies. Our operator is semantically equivalent to a sequence of aggregations and can be naturally integrated into existing relational query plans. As a proof of concept, we developed SPROUT, an extension of the PostgreSQL 8.3.3 query engine. We study optimizations that push or pull our operator, or parts thereof, past joins. The operator employs static information, such as the query structure and functional dependencies, to decide which constituent aggregations can be evaluated together in one scan and how many scans are needed for the overall confidence computation task. A case study on the TPC-H benchmark reveals that most TPC-H queries obtained by removing aggregations can be evaluated efficiently using our operator. Experimental evaluation on probabilistic TPC-H data shows substantial efficiency improvements when compared to the state of the art.
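To make the phrase "a sequence of aggregations" concrete, here is a minimal in-memory sketch in Python. It is not SPROUT's operator and does not touch secondary storage or PostgreSQL. It assumes a hypothetical Boolean query Q() :- R(a), S(a, b) without self-joins over two tuple-independent relations, and every function and variable name is invented for illustration. Confidence is computed by two aggregation steps (eliminate b, then eliminate a), and the result is checked against brute-force enumeration of possible worlds.

```python
from collections import defaultdict
from itertools import product


def independent_or(probs):
    """Probability that at least one of several independent events occurs."""
    none_occurs = 1.0
    for p in probs:
        none_occurs *= 1.0 - p
    return 1.0 - none_occurs


def confidence(r, s):
    """Confidence of the hypothetical query Q() :- R(a), S(a, b).

    r maps a -> tuple probability, s maps (a, b) -> tuple probability.
    All tuples are assumed to be independent events.
    """
    # Aggregation 1: group S by a and eliminate b with an independent "or".
    s_by_a = defaultdict(list)
    for (a, _b), p in s.items():
        s_by_a[a].append(p)
    # Join with R on a (independent "and", i.e. a product of probabilities).
    per_a = [r[a] * independent_or(ps) for a, ps in s_by_a.items() if a in r]
    # Aggregation 2: eliminate a with another independent "or".
    return independent_or(per_a)


def brute_force(r, s):
    """Reference answer: sum the weights of all possible worlds satisfying Q."""
    tuples = [("R", k, p) for k, p in r.items()]
    tuples += [("S", k, p) for k, p in s.items()]
    total = 0.0
    for bits in product([False, True], repeat=len(tuples)):
        weight = 1.0
        present_r, present_s = set(), set()
        for (rel, key, p), bit in zip(tuples, bits):
            weight *= p if bit else 1.0 - p
            if bit:
                (present_r if rel == "R" else present_s).add(key)
        # Q holds if some S tuple (a, b) joins with a present R tuple a.
        if any(a in present_r for (a, _b) in present_s):
            total += weight
    return total


# Made-up example data, chosen only so the arithmetic is easy to follow.
r_example = {1: 0.5, 2: 0.4}
s_example = {(1, "x"): 0.5, (1, "y"): 0.5, (2, "x"): 0.5}
```

On the example data, the group a = 1 contributes 0.5 * (1 - 0.5 * 0.5) = 0.375 and the group a = 2 contributes 0.4 * 0.5 = 0.2, so the confidence is 1 - (1 - 0.375) * (1 - 0.2) = 0.5. The aggregation-based version visits each tuple once, whereas the brute-force check enumerates every possible world and is only practical for tiny inputs.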