Probabilistic reasoning in intelligent systems: networks of plausible inference
Probabilistic reasoning in intelligent systems: networks of plausible inference
Factoring and recognition of read-once functions using cographs and normality
Proceedings of the 38th annual Design Automation Conference
Probabilistic Networks and Expert Systems
Probabilistic Networks and Expert Systems
Introduction to Algorithms
Evaluating probabilistic queries over imprecise data
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Creating probabilistic databases from information extraction models
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Efficient query evaluation on probabilistic databases
VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
Data integration with uncertainty
VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Event queries on correlated probabilistic streams
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Approximate lineage for probabilistic databases
Proceedings of the VLDB Endowment
Exploiting shared correlations in probabilistic databases
Proceedings of the VLDB Endowment
Fast and Simple Relational Processing of Uncertain Data
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Exploiting Lineage for Confidence Computation in Uncertain and Probabilistic Databases
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Online Filtering, Smoothing and Probabilistic Modeling of Streaming data
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Access Methods for Markovian Streams
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Ef?cient Query Evaluation over Temporally Correlated Probabilistic Streams
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Proceedings of the twenty-eighth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Uncertainty management in rule-based information extraction systems
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
Indexing correlated probabilistic databases
Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
An edge deletion semantics for belief propagation and its practical impact on approximation quality
AAAI'06 proceedings of the 21st national conference on Artificial intelligence - Volume 2
Thin junction tree filters for simultaneous localization and mapping
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Sensitivity analysis and explanations for robust query evaluation in probabilistic databases
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Lineage for Markovian stream event queries
Proceedings of the 10th ACM International Workshop on Data Engineering for Wireless and Mobile Access
Database foundations for scalable RDF processing
RW'11 Proceedings of the 7th international conference on Reasoning web: semantic technologies for the web of data
Interactive reasoning in uncertain RDF knowledge bases
Proceedings of the 20th ACM international conference on Information and knowledge management
Local structure and determinism in probabilistic databases
SIGMOD '12 Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
Probabilistic databases with MarkoViews
Proceedings of the VLDB Endowment
Hi-index | 0.00 |
In this paper, we address the problem of scalably evaluating conjunctive queries over correlated probabilistic databases containing tuple or attribute uncertainties. Like previous work, we adopt a two-phase approach where we first compute lineages of the output tuples, and then compute the probabilities of the lineage formulas. However unlike previous work, we allow for arbitrary and complex correlations to be present in the data, captured via a forest of junction trees. We observe that evaluating even read-once (tree structured) lineages (e.g., those generated by hierarchical conjunctive queries), polynomially computable over tuple independent probabilistic databases, is #P-complete for lightly correlated probabilistic databases like Markov sequences. We characterize the complexity of exact computation of the probability of the lineage formula on a correlated database using a parameter called lwidth (analogous to the notion of treewidth). For lineages that result in low lwidth, we compute exact probabilities using a novel message passing algorithm, and for lineages that induce large lwidths, we develop approximate Monte Carlo algorithms to estimate the result probabilities. We scale our algorithms to very large correlated probabilistic databases using the previously proposed INDSEP data structure. To mitigate the complexity of lineage evaluation, we develop optimization techniques to process a batch of lineages by sharing computation across formulas, and to exploit any independence relationships that may exist in the data. Our experimental study illustrates the benefits of using our algorithms for processing lineage formulas over correlated probabilistic databases.