Probabilistic databases with MarkoViews

Authors:
Abhay Jha;Dan Suciu
Affiliations:
University of Washington, Seattle, WA;University of Washington, Seattle, WA
Venue:
Proceedings of the VLDB Endowment
Year:
2012

Abstract

Most work on query evaluation in probabilistic databases has focused on the simple tuple-independent data model, where tuples are independent random events. Several efficient query evaluation techniques exist in this setting, such as safe plans, algorithms based on OBDDs, tree decompositions, and a variety of approximation algorithms. However, complex data analytics tasks often require complex correlations, and query evaluation then becomes significantly more expensive or more restrictive. In this paper, we propose MVDB as a framework both for representing complex correlations and for efficient query evaluation. An MVDB specifies correlations by defining views, called MarkoViews, over the probabilistic relations and declaring the weights of the views' outputs. An MVDB is a (very large) Markov Logic Network. We make two sets of contributions. First, we show that query evaluation on an MVDB is equivalent to evaluating a Union of Conjunctive Queries (UCQ) over a tuple-independent database. The translation is exact (thus allowing the techniques developed for tuple-independent databases to be carried over to MVDBs), yet it is novel and quite non-obvious (some resulting probabilities may be negative!). This translation by itself, though, may not lead to much gain, since the translated query becomes more complicated as we try to capture more correlations. Our second contribution is a new query evaluation strategy that exploits offline compilation to speed up online query evaluation. Here we utilize and extend our prior work on the compilation of UCQs. We experimentally validate our techniques on a large probabilistic database with MarkoViews inferred from the DBLP data.
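
As a toy illustration of the first contribution, the following Python sketch compares two ways of computing a query probability on a two-tuple database. The abstract does not spell out the construction, so everything here is an assumption made for illustration: the tuples R(a) and S(a), their probabilities, the single view output with weight 3, the helper names, the reading of a view weight as a multiplicative factor on every world where the view holds, and in particular the translation itself, which adds one independent tuple NV with "probability" 1 - w and conditions on NV AND V being false. It is a brute-force check over possible worlds, not the paper's algorithm.

```python
from itertools import product


def world_weight(world, probs):
    """Weight of one possible world under tuple independence."""
    weight = 1.0
    for present, p in zip(world, probs):
        weight *= p if present else 1.0 - p
    return weight


def mvdb_probability(query, view, view_weight, probs):
    """Direct semantics (assumed): multiply a world's weight by view_weight
    whenever the view output holds in it, then normalize."""
    numerator = 0.0
    normalizer = 0.0
    for world in product([False, True], repeat=len(probs)):
        weight = world_weight(world, probs)
        if view(world):
            weight *= view_weight
        normalizer += weight
        if query(world):
            numerator += weight
    return numerator / normalizer


def translated_probability(query, view, view_weight, probs):
    """Assumed translation: add an independent tuple NV with 'probability'
    1 - view_weight (negative when view_weight > 1), let W = NV AND view,
    and return (P0(query OR W) - P0(W)) / (1 - P0(W))."""
    extended = list(probs) + [1.0 - view_weight]
    p_w = 0.0
    p_q_or_w = 0.0
    for world in product([False, True], repeat=len(extended)):
        weight = world_weight(world, extended)
        base, nv_present = world[:-1], world[-1]
        w_holds = nv_present and view(base)
        if w_holds:
            p_w += weight
        if w_holds or query(base):
            p_q_or_w += weight
    return (p_q_or_w - p_w) / (1.0 - p_w)


# Hypothetical instance: world = (R(a) present, S(a) present).
probs = [0.3, 0.5]
view = lambda world: world[0] and world[1]    # V :- R(a), S(a)
query = lambda world: world[0] or world[1]    # Q :- R(a) OR S(a)
view_weight = 3.0

direct = mvdb_probability(query, view, view_weight, probs)
translated = translated_probability(query, view, view_weight, probs)
print(direct, translated)
```

In this instance the extra tuple NV gets "probability" 1 - 3 = -2, so some intermediate world weights are negative, yet the two functions agree up to floating-point rounding. Enumerating worlds is exponential in the number of tuples, so the sketch only serves to check the identity on small inputs.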