There has been a recent surge of work on probabilistic databases, propelled in large part by the huge increase in noisy data sources, including sensor data, experimental data, and data from uncurated sources. There is a growing need for database management systems that can efficiently represent and query such data. In this work, we show how data characteristics can be leveraged to make query evaluation more efficient. In particular, we exploit what we refer to as shared correlations, where the same uncertainties and correlations occur repeatedly in the data. Shared correlations arise mainly for two reasons: (1) uncertainty and correlations usually come from general statistics and rarely vary on a tuple-to-tuple basis; (2) the query evaluation procedure itself tends to re-introduce the same correlations. Prior work has shown that the query evaluation problem on probabilistic databases is equivalent to a probabilistic inference problem on an appropriately constructed probabilistic graphical model (PGM). We leverage this by introducing a new data structure, called the random variable elimination graph (rv-elim graph), that can be built from the PGM obtained from query evaluation. We develop techniques based on bisimulation that compress the rv-elim graph by exploiting the presence of shared correlations in the PGM; the compressed rv-elim graph can then be used to run inference. We validate our methods empirically and show that, even with a few shared correlations, significant speed-ups are possible.
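To make the compression idea concrete, the following is a minimal sketch of bisimulation by naive partition refinement on a small labeled graph. It is not the paper's algorithm or its rv-elim graph definition, which the abstract does not spell out. The node names, the labels (`"p=0.8"`, `"and"`), the `children` layout, and the function `bisimulation_blocks` are all hypothetical and chosen only for illustration. The sketch assumes that two nodes can share work when they carry the same label and their children fall into the same equivalence classes.

```python
def bisimulation_blocks(labels, children):
    """Group nodes into bisimulation classes by naive partition refinement.

    labels:   dict node -> hashable label (assumed here to stand for a
              shared factor or operation)
    children: dict node -> list of child nodes
    Returns a dict node -> block id; nodes with equal ids are bisimilar.
    """
    # Initial partition: nodes with the same label share a block.
    ids = {}
    block = {n: ids.setdefault(lab, len(ids)) for n, lab in labels.items()}

    while True:
        # A node's signature is its current block plus the set of blocks
        # its children belong to. Using a set ignores child order and
        # multiplicity, which is a simplifying assumption of this sketch.
        sigs = {}
        new_block = {}
        for n in labels:
            sig = (block[n], frozenset(block[c] for c in children.get(n, ())))
            new_block[n] = sigs.setdefault(sig, len(sigs))

        # Refinement only ever splits blocks, so an unchanged block count
        # means the partition is stable.
        if len(sigs) == len(set(block.values())):
            return new_block
        block = new_block


# Hypothetical example: three tuples, two of which share the same priors.
labels = {
    "x1": "p=0.8", "x2": "p=0.8", "x3": "p=0.3",
    "y1": "p=0.5", "y2": "p=0.5", "y3": "p=0.5",
    "j1": "and", "j2": "and", "j3": "and",
}
children = {"j1": ["x1", "y1"], "j2": ["x2", "y2"], "j3": ["x3", "y3"]}

blocks = bisimulation_blocks(labels, children)
num_blocks = len(set(blocks.values()))
print(f"{len(labels)} nodes compressed to {num_blocks} blocks")
```

In this toy input, `j1` and `j2` land in the same block because their children have identical labels, while `j3` is kept apart because one of its children has a different prior. A single representative per block could then stand in for every member during inference, which is the kind of saving that shared correlations make possible. The partition refinement algorithms used in practice are more efficient than this quadratic loop.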