Causality and responsibility: probabilistic queries revisited in uncertain databases

Authors:
Xiang Lian;Lei Chen
Affiliations:
The University of Texas - Pan American, Edinburg, TX;The Hong Kong University of Sci. and Tech., Kowloon, Hong Kong, China
Venue:
Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Year:
2013

Citing 28
Cited 0

The R*-tree: an efficient and robust access method for points and rectangles

SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Tracing the lineage of view data in a warehousing environment

ACM Transactions on Database Systems (TODS)
On the 'Dimensionality Curse' and the 'Self-Similarity Blessing'

IEEE Transactions on Knowledge and Data Engineering
Evaluating probabilistic queries over imprecise data

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
GADT: A Probability Space ADT for Representing and Querying the Physical World

ICDE '02 Proceedings of the 18th International Conference on Data Engineering
Querying Imprecise Data in Moving Object Environments

IEEE Transactions on Knowledge and Data Engineering
MYSTIQ: a system for finding more answers by using probabilities

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
U-DBMS: a database system for managing constantly-evolving data

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Adaptive cleaning for RFID data streams

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
The new Casper: query processing for location services without compromising privacy

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
ULDBs: databases with uncertainty and lineage

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Efficient query evaluation on probabilistic databases

The VLDB Journal — The International Journal on Very Large Data Bases
Probabilistic skylines on uncertain data

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Ranking queries on uncertain data: a probabilistic threshold approach

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
MCDB: a monte carlo approach to managing uncertain data

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Efficient search for the top-k probable nearest neighbors in uncertain databases

Proceedings of the VLDB Endowment
BayesStore: managing large, uncertain data repositories with probabilistic graphical models

Proceedings of the VLDB Endowment
Cleaning uncertain data with quality guarantees

Proceedings of the VLDB Endowment
Database Support for Probabilistic Attributes and Tuples

ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Efficient processing of probabilistic reverse nearest neighbor queries over uncertain data

The VLDB Journal — The International Journal on Very Large Data Bases
Responsibility and blame: a structural-model approach

Journal of Artificial Intelligence Research
Probabilistic Reverse Nearest Neighbor Queries on Uncertain Data

IEEE Transactions on Knowledge and Data Engineering
Probabilistic nearest-neighbor query on uncertain objects

DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications
The complexity of causality and responsibility for query answers and non-answers

Proceedings of the VLDB Endowment
Tracing data errors with view-conditioned causality

Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Sensitivity analysis and explanations for robust query evaluation in probabilistic databases

Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Efficient probabilistic reverse nearest neighbor query processing on uncertain data

Proceedings of the VLDB Endowment
Causes and explanations: a structural-model approach: part i: causes

UAI'01 Proceedings of the Seventeenth conference on Uncertainty in artificial intelligence

Quantified Score

Hi-index	0.00

Visualization

Abstract

Recently, due to ubiquitous data uncertainty in many real-life applications, it has become increasingly important to study efficient and effective processing of various probabilistic queries over uncertain data, which usually retrieve uncertain objects that satisfy query predicates with high probabilities. However, one annoying, yet challenging, problem is that, some probabilistic queries are very sensitive to low-quality objects in uncertain databases, and the returned query answers might miss some important results (due to low data quality). To identify both accurate query answers and those potentially low-quality objects, in this paper, we investigate the causes of query answers/non-answers from a novel angle of causality and responsibility (CR), and propose a new interpretation of probabilistic queries. Particularly, we focus on the problem of CR-based probabilistic nearest neighbor (CR-PNN) query, and design a general framework for answering CR-based queries (including CR-PNN), which can return both query answers with high confidences and low-quality objects that may potentially affect query results (for data cleaning purposes). To efficiently process CR-PNN queries, we propose effective pruning strategies to quickly filter out false alarms, and design efficient algorithms to obtain CR-PNN answers. Extensive experiments have been conducted to verify the efficiency and effectiveness of our proposed approaches.