A variety of integrity constraints have been studied for data cleaning. While these constraints can detect the presence of errors, they fall short of telling us how to correct them. Indeed, data repairing based on these constraints may not find certain fixes, i.e., fixes that are guaranteed to be correct, and worse still, may even introduce new errors when attempting to repair the data. We propose a method for finding certain fixes, based on master data, a notion of certain regions, and a class of editing rules. A certain region is a set of attributes that the users have assured to be correct. Given a certain region and master data, editing rules tell us which attributes to fix and how to update them. We show how the method can be used in data monitoring and enrichment. We also develop techniques for reasoning about editing rules, to decide whether they lead to a unique fix and whether they are able to fix all the attributes in a tuple, relative to master data and a certain region. Furthermore, we present a framework and an algorithm to find certain fixes, by interacting with the users to ensure that one of the certain regions is correct. We experimentally verify the effectiveness and scalability of the algorithm.
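To make the interplay of certain regions, master data, and editing rules concrete, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the algorithm from the paper: the names `EditingRule` and `apply_rules`, the dictionary-based records, and the sample attributes (`zip`, `city`, `area_code`) are all hypothetical. The sketch omits the pattern conditions that editing rules may carry, and it only copies a value from master data when every matching master tuple agrees on it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditingRule:
    # Hypothetical encoding: pairs of (input attribute, master attribute)
    # that must agree before the rule may fire.
    match: tuple
    # (input attribute to fix, master attribute to copy the value from)
    target: tuple


def apply_rules(record, validated, rules, master):
    """Propagate fixes from a validated region using master data.

    `validated` plays the role of the certain region: attributes the
    user has assured to be correct. A rule fires only when all of its
    match attributes are validated and the matching master tuples agree
    on a single value for the target.
    """
    fixed = dict(record)
    validated = set(validated)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            tgt_in, tgt_master = rule.target
            if tgt_in in validated:
                continue  # already assured or already fixed
            if not all(attr in validated for attr, _ in rule.match):
                continue  # rule premises are not yet trustworthy
            candidates = {
                m[tgt_master]
                for m in master
                if all(fixed[a] == m[b] for a, b in rule.match)
            }
            if len(candidates) == 1:  # unambiguous value: apply the fix
                fixed[tgt_in] = candidates.pop()
                validated.add(tgt_in)
                changed = True
    return fixed, validated


# Illustrative (made-up) master data and a dirty input record.
master = [
    {"zip": "EH8 9AB", "city": "Edinburgh", "area_code": "131"},
    {"zip": "SW1A 1AA", "city": "London", "area_code": "020"},
]
rules = [
    EditingRule(match=(("zip", "zip"),), target=("city", "city")),
    EditingRule(match=(("city", "city"),), target=("area_code", "area_code")),
]
record = {"zip": "EH8 9AB", "city": "Ldn", "area_code": "020"}

fixed, validated = apply_rules(record, {"zip"}, rules, master)
```

In this sketch the user validates only `zip`. The first rule then fixes `city` from the master data, which in turn enables the second rule to fix `area_code`, so the validated region grows until it covers the whole record. When the region stops growing before every attribute is covered, the framework described above would ask the user to validate further attributes.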