Probabilistic database systems have established themselves as a tool for managing uncertain data. However, much of the research in this area has focused on efficient query evaluation and has largely ignored two key issues that commonly arise in uncertain data management. The first is how to provide explanations for query results, e.g., "Why is this tuple in my result?" or "Why does this output tuple have such a high probability?". The second is how to determine the sensitive input tuples for a given query: users want to know which input tuples can substantially alter the output when their probabilities are modified, since they may be unsure about the input probability values. Existing systems provide, in addition to the output probabilities, the lineage/provenance of each output tuple, which is a Boolean formula indicating the dependence of the output tuple on the input tuples. However, lineage does not immediately provide a quantitative relationship, and it is not informative when we have multiple output tuples. In this paper, we propose a unified framework that handles both issues to facilitate robust query processing. We formally define the notions of influence and explanations, and provide algorithms to determine the top-l influential set of variables and the top-l set of explanations for a variety of queries, including conjunctive queries, probabilistic threshold queries, top-k queries, and aggregation queries. Further, our framework naturally enables highly efficient incremental evaluation when input probabilities are modified (e.g., if uncertainty is resolved). Our preliminary experimental results demonstrate the benefits of our framework for performing robust query processing over probabilistic databases.
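To make the idea of sensitivity concrete, the following is a minimal sketch and not the paper's algorithm. It assumes independent input tuples and takes the influence of an input variable to be the change in output probability when that variable is forced true versus false, which is one common formalization; the abstract does not state the paper's exact definition. The lineage formula, variable names, probabilities, and all function names here are hypothetical, and the brute-force enumeration is exponential in the number of variables.

```python
from itertools import product


def probability(lineage, probs):
    """P(lineage is true), computed by enumerating all possible worlds.

    Assumes the input variables are independent. Exponential cost, so this
    is only suitable for tiny illustrative formulas.
    """
    names = sorted(probs)
    total = 0.0
    for values in product([False, True], repeat=len(names)):
        world = dict(zip(names, values))
        if lineage(world):
            weight = 1.0
            for name in names:
                weight *= probs[name] if world[name] else 1.0 - probs[name]
            total += weight
    return total


def influence(lineage, probs, name):
    """Assumed definition: P(lineage | name true) - P(lineage | name false)."""
    forced_true = probability(lineage, {**probs, name: 1.0})
    forced_false = probability(lineage, {**probs, name: 0.0})
    return forced_true - forced_false


def top_influential(lineage, probs, l):
    """Return the l variables with the largest influence, as (name, score)."""
    scored = [(influence(lineage, probs, n), n) for n in probs]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, score) for score, name in scored[:l]]


def example_lineage(world):
    """Hypothetical lineage of one output tuple: (x1 AND x2) OR x3."""
    return (world["x1"] and world["x2"]) or world["x3"]


example_probs = {"x1": 0.5, "x2": 0.8, "x3": 0.1}

if __name__ == "__main__":
    print(probability(example_lineage, example_probs))
    print(top_influential(example_lineage, example_probs, 2))
```

Under the independence assumption the output probability is linear in each individual input probability, so changing one input by some delta changes the output by the influence times delta. This is one way to see why incremental re-evaluation after a probability update can be cheap once influences are known.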