Optimizing MPF queries: decision support and probabilistic inference

Authors:
Héctor Corrada Bravo;Raghu Ramakrishnan
Affiliations:
University of Wisconsin-Madison, Madison, WI;Yahoo! Research, Santa Clara, CA
Venue:
Proceedings of the 2007 ACM SIGMOD international conference on Management of data
Year:
2007



Abstract

Managing uncertain data using probabilistic frameworks has attracted much interest lately in the database literature, and a central computational challenge is probabilistic inference. This paper presents a broad class of aggregate queries, called MPF queries, inspired by the literature on probabilistic inference in statistics and machine learning. An MPF (Marginalize a Product Function) query is an aggregate query over a stylized join of several relations. In probabilistic inference, this join corresponds to taking the product of several probability distributions, while the aggregate operation corresponds to marginalization. Probabilistic inference can be expressed directly as MPF queries in a relational setting, and therefore, by optimizing evaluation of MPF queries, we provide scalable support for probabilistic inference in database systems. To optimize MPF queries, we build on ideas from database query optimization as well as traditional algorithms such as Variable Elimination and Belief Propagation from the probabilistic inference literature. Although our main motivation for introducing MPF queries is to support easy expression and efficient evaluation of probabilistic inference in a DBMS, we observe that this class of queries is very useful for a range of decision support tasks. We present and optimize MPF queries in a general form where arbitrary functions (i.e., other than probability distributions) are handled, and demonstrate their value for decision support applications through a number of illustrative and natural examples.
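
To make the "marginalize a product function" idea concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes a toy chain of three binary variables A, B, C with made-up probabilities, and models each relation as a pair of variable names and a table from value tuples to a measure. The helper names `product_join` and `marginalize` are hypothetical. The sketch evaluates the same query two ways: joining everything before aggregating, and summing out a variable early in the spirit of Variable Elimination.

```python
# Each functional relation is (variables, table): table maps a tuple of
# values (one per variable, in order) to a non-negative measure.

def product_join(f, g):
    """Natural join of two functional relations, multiplying their measures."""
    f_vars, f_tab = f
    g_vars, g_tab = g
    out_vars = f_vars + tuple(v for v in g_vars if v not in f_vars)
    out = {}
    for f_key, f_val in f_tab.items():
        f_row = dict(zip(f_vars, f_key))
        for g_key, g_val in g_tab.items():
            g_row = dict(zip(g_vars, g_key))
            # Rows join only if they agree on every shared variable.
            if all(f_row[v] == g_row[v] for v in g_vars if v in f_row):
                row = {**f_row, **g_row}
                out[tuple(row[v] for v in out_vars)] = f_val * g_val
    return out_vars, out

def marginalize(f, keep):
    """Group by the variables in `keep`, summing the measure over the rest."""
    f_vars, f_tab = f
    kept = tuple(v for v in f_vars if v in keep)
    out = {}
    for key, val in f_tab.items():
        row = dict(zip(f_vars, key))
        group = tuple(row[v] for v in kept)
        out[group] = out.get(group, 0.0) + val
    return kept, out

# Toy chain A -> B -> C; all numbers are illustrative assumptions.
p_a = (("A",), {(0,): 0.6, (1,): 0.4})
p_b_given_a = (("A", "B"), {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8})
p_c_given_b = (("B", "C"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.5})

# Plan 1: join all relations, then aggregate down to C.
joint = product_join(product_join(p_a, p_b_given_a), p_c_given_b)
naive = marginalize(joint, {"C"})

# Plan 2: sum out A before joining with the relation that mentions C.
p_b = marginalize(product_join(p_a, p_b_given_a), {"B"})
pushed = marginalize(product_join(p_b, p_c_given_b), {"C"})

print(naive)
print(pushed)
```

Both plans compute the same marginal over C, but the second never materializes the full three-variable join. That gap in intermediate result size is the kind of cost difference a query optimizer for this class of queries can exploit.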