Consistent query answers in inconsistent probabilistic databases

Authors:
Xiang Lian;Lei Chen;Shaoxu Song
Affiliations:
The Hong Kong University of Science and Technology, Hong Kong, Hong Kong;The Hong Kong University of Science and Technology, Hong Kong, Hong Kong;The Hong Kong University of Science and Technology, Hong Kong, Hong Kong
Venue:
Proceedings of the 2010 ACM SIGMOD International Conference on Management of data
Year:
2010

Citing 35
Cited 4

The R*-tree: an efficient and robust access method for points and rectangles

SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
A model for the prediction of R-tree performance

PODS '96 Proceedings of the fifteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Consistent query answers in inconsistent databases

PODS '99 Proceedings of the eighteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Real-world Data is Dirty: Data Cleansing and The Merge/Purge Problem

Data Mining and Knowledge Discovery
Reliability of Answers to Queries in Relational Databases

IEEE Transactions on Knowledge and Data Engineering
Spatial Joins Using R-trees: Breadth-First Traversal with Global Optimizations

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Evaluating probabilistic queries over imprecise data

Proceedings of the 2003 ACM SIGMOD international conference on Management of data
Methods for evaluating and creating data quality

Information Systems - Special issue: Data quality in cooperative information systems
A cost-based model and effective heuristic for repairing constraints by value modification

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
ConQuer: efficient management of inconsistent databases

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
MYSTIQ: a system for finding more answers by using probabilities

Proceedings of the 2005 ACM SIGMOD international conference on Management of data
REED: robust, efficient filtering and event detection in sensor networks

VLDB '05 Proceedings of the 31st international conference on Very large data bases
U-DBMS: a database system for managing constantly-evolving data

VLDB '05 Proceedings of the 31st international conference on Very large data bases
Database repairing using updates

ACM Transactions on Database Systems (TODS)
Clean Answers over Dirty Databases: A Probabilistic Approach

ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
ULDBs: databases with uncertainty and lineage

VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Branch-and-bound processing of ranked queries

Information Systems
Efficient query evaluation on probabilistic databases

The VLDB Journal — The International Journal on Very Large Data Bases
Merging models based on given correspondences

VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
Improving data quality: consistency and accuracy

VLDB '07 Proceedings of the 33rd international conference on Very large data bases
Ranking queries on uncertain data: a probabilistic threshold approach

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
MCDB: a monte carlo approach to managing uncertain data

Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Dependencies revisited for improving data quality

Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
BayesStore: managing large, uncertain data repositories with probabilistic graphical models

Proceedings of the VLDB Endowment
Top-k aggregation using intersections of ranked inputs

Proceedings of the Second ACM International Conference on Web Search and Data Mining
Efficient Processing of Top-k Queries in Uncertain Databases

ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Exploiting Lineage for Confidence Computation in Uncertain and Probabilistic Databases

ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Semantics of Ranking Queries for Probabilistic Data and Expected Ranks

ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Distributed top-k aggregation queries at large

Distributed and Parallel Databases
Canopy closure estimates with GreenOrbs: sustainable sensing in the forest

Proceedings of the 7th ACM Conference on Embedded Networked Sensor Systems
A unified approach to ranking in probabilistic databases

Proceedings of the VLDB Endowment
Integrating conflicting data: the role of source dependence

Proceedings of the VLDB Endowment
Truth discovery and copying detection in a dynamic world

Proceedings of the VLDB Endowment
Modeling and querying possible repairs in duplicate detection

Proceedings of the VLDB Endowment
Minimal-change integrity maintenance using tuple deletions

Information and Computation

Development of foundation models for Internet of Things

Frontiers of Computer Science in China
Cost-efficient repair in inconsistent probabilistic databases

Proceedings of the 20th ACM international conference on Information and knowledge management
Scrubbing query results from probabilistic databases

Proceedings of the 15th Symposium on International Database Engineering & Applications
Consistent answers in probabilistic datalog+/--- ontologies

RR'12 Proceedings of the 6th international conference on Web Reasoning and Rule Systems

Quantified Score

Hi-index	0.00

Visualization

Abstract

Efficient and effective manipulation of probabilistic data has become increasingly important recently due to many real applications that involve the data uncertainty. This is especially crucial when probabilistic data collected from different sources disagree with each other and incur inconsistencies. In order to accommodate such inconsistencies and enable consistent query answering (CQA), in this paper, we propose the all-possible-repair semantics in the context of inconsistent probabilistic databases, which formalize the repairs on the database as repair worlds via a graph representation. In turn, the CQA problem can be converted into one in the so-called repaired possible worlds (w.r.t. both repair worlds and possible worlds). We investigate a series of consistent queries in inconsistent probabilistic databases, including consistent range queries, join, and top-k queries, which, however, need to deal with an exponential number of the repaired possible worlds at high cost. To tackle the efficiency problem of CQA, in this paper, we propose efficient approaches for retrieving consistent query answers, including effective pruning methods to filter out false positives. Extensive experiments have been conducted to demonstrate the efficiency and effectiveness of our approaches.