A variety of integrity constraints have been studied for data cleaning. While these constraints can detect the presence of errors, they fall short of telling us how to correct them. Indeed, data repairing based on these constraints may not find certain fixes, that is, fixes that are guaranteed to be correct, and worse, may introduce new errors when repairing the data. We propose a method for finding certain fixes, based on master data, a notion of certain regions, and a class of editing rules. A certain region is a set of attributes that users have assured to be correct. Given a certain region and master data, editing rules tell us which attributes to fix and how to update them. We show how the method can be used in data monitoring and enrichment. We develop techniques for reasoning about editing rules, to decide whether they lead to a unique fix and whether they are able to fix all the attributes in a tuple, relative to master data and a certain region. We also provide an algorithm to identify minimal certain regions, such that a certain fix is warranted by editing rules and master data as long as one of the regions is correct. We experimentally verify the effectiveness and scalability of the algorithm.
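To make the interplay of certain regions, master data, and editing rules concrete, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the algorithm from the paper: the `EditingRule` class, the `apply_rules` function, the attribute names, and the sample data are all hypothetical. A rule is modeled as "if the input tuple agrees with a master tuple on some already validated attributes, copy one master value into the input tuple". The sketch simply skips a rule when the master data offers more than one candidate value, and it does not check that the result is independent of rule order, which is the uniqueness question the abstract refers to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditingRule:
    # Hypothetical, simplified rule shape (an assumption of this sketch).
    match: tuple   # pairs (input_attr, master_attr) that must agree
    target: tuple  # (input_attr, master_attr): master value copied into input


def apply_rules(record, certain_region, master, rules):
    """Repeatedly apply rules whose match attributes are already validated.

    Returns the updated record and the set of validated attributes.
    """
    fixed = dict(record)
    validated = set(certain_region)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            tgt_in, tgt_master = rule.target
            needed = [a for a, _ in rule.match]
            # Only fire on validated inputs, and never overwrite validated data.
            if tgt_in in validated or not all(a in validated for a in needed):
                continue
            candidates = {
                m[tgt_master]
                for m in master
                if all(fixed[a] == m[b] for a, b in rule.match)
            }
            # Apply only when master data yields exactly one value.
            if len(candidates) == 1:
                fixed[tgt_in] = candidates.pop()
                validated.add(tgt_in)
                changed = True
    return fixed, validated


# Illustrative data: the user vouches only for the zip code.
master = [
    {"zip": "EH7 4AH", "city": "Edinburgh", "area_code": "131"},
    {"zip": "NW1 6XE", "city": "London", "area_code": "020"},
]
rules = [
    EditingRule(match=(("zip", "zip"),), target=("city", "city")),
    EditingRule(match=(("city", "city"),), target=("area_code", "area_code")),
]
record = {"zip": "EH7 4AH", "city": "London", "area_code": "020"}

fixed, validated = apply_rules(record, {"zip"}, master, rules)
print(fixed)
print(sorted(validated))
```

In this example the single validated attribute `zip` is enough to fix the whole tuple: the first rule corrects `city` from master data, which then validates `city` and lets the second rule correct `area_code`. In the terminology of the abstract, `{zip}` plays the role of a certain region for this toy rule set.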