Violations of functional dependencies (FDs) are common in practice, often arising in data integration or Web data extraction. Resolving these violations is challenging for several reasons, one being the exponential number of possible "repairs". Previous work has tackled this problem either by producing a single repair that is (nearly) optimal with respect to some metric, or by computing consistent answers to selected classes of queries without explicitly generating the repairs. In this paper, we propose a novel data cleaning approach, sampling from the space of possible repairs, that is not limited to finding a single repair or to a particular class of queries. We give several motivating scenarios where sampling from the space of FD repairs is desirable, propose a new class of useful repairs, and present an algorithm that randomly samples from this space. We also show how to restrict the space of generated repairs using user-defined hard constraints that specify an immutable, trusted subset of the input relation, and we experimentally evaluate our algorithm against previous approaches. Although this paper focuses on repairing FD violations, we expect the proposed sampling approach to apply to other integrity constraints with large repair spaces.
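To make the notions of an FD violation, a repair, and a trusted subset concrete, here is a minimal Python sketch. It is not the algorithm from the paper: it assumes a deliberately simplified repair model in which only right-hand-side values are overwritten, and it makes no claim about the distribution it samples from. The function names `fd_violations` and `sample_repair`, the `trusted` parameter, and the example relation are all hypothetical, introduced only for illustration.

```python
import random
from collections import defaultdict


def fd_violations(rows, lhs, rhs):
    """Return {lhs_value: [row indices]} for groups that violate the FD lhs -> rhs.

    A group violates the FD when its tuples agree on every lhs attribute
    but carry more than one distinct combination of rhs values.
    """
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row[a] for a in lhs)].append(i)
    return {
        key: idx
        for key, idx in groups.items()
        if len({tuple(rows[i][a] for a in rhs) for i in idx}) > 1
    }


def sample_repair(rows, lhs, rhs, trusted=frozenset(), rng=random):
    """Sample one repair by overwriting rhs values (simplified, assumed model).

    For each violating group, one rhs value already present in the group is
    chosen at random and copied to every tuple of the group. Rows whose index
    is in `trusted` are treated as immutable: if a group contains a trusted
    row, that row's rhs value is the only candidate.
    """
    repaired = [dict(r) for r in rows]  # never mutate the input relation
    for key, idx in fd_violations(rows, lhs, rhs).items():
        trusted_vals = {tuple(rows[i][a] for a in rhs) for i in idx if i in trusted}
        if len(trusted_vals) > 1:
            raise ValueError(f"trusted rows conflict for lhs value {key}")
        if trusted_vals:
            candidates = list(trusted_vals)
        else:
            # Distinct rhs values in first-seen order, so the choice is
            # reproducible for a seeded random generator.
            candidates = list(dict.fromkeys(tuple(rows[i][a] for a in rhs) for i in idx))
        chosen = rng.choice(candidates)
        for i in idx:
            for attr, value in zip(rhs, chosen):
                repaired[i][attr] = value
    return repaired


# Hypothetical relation with the FD zip -> city; rows 0 and 1 conflict.
rows = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "NYC"},
    {"zip": "60601", "city": "Chicago"},
]
one_repair = sample_repair(rows, ["zip"], ["city"], rng=random.Random(7))
```

Even under this restricted model, each violating group contributes an independent choice, so the number of repairs grows multiplicatively with the number of conflicting groups. That growth is the reason enumerating all repairs is impractical and sampling becomes attractive. Marking a row as trusted, for example `sample_repair(rows, ["zip"], ["city"], trusted={1})`, removes every repair that would change that row.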