In this paper, we present the design of a data cleaning framework that combines data quality rules (CFDs, CINDs, and MDs) with user feedback in an interactive process. First, to generate candidate repairs for each potentially dirty attribute, we propose an optimization model based on a genetic algorithm. We then build a Bayesian machine learning model with several committees to predict the correctness of each repair, and we rank the repairs by uncertainty score in order to improve the learned model. While inspecting the suggestions, the user provides feedback that is used to decide whether the model is accurate. Finally, our experiments on real-world datasets show a significant improvement in data quality.
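The uncertainty-based ranking step can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say how the uncertainty score is computed, so the sketch assumes vote entropy over a committee (as in query-by-committee active learning). The `Repair` record layout, the four toy committee members, and the sample repairs are all hypothetical stand-ins for trained Bayesian models and real candidate repairs.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical record layout for a candidate repair (not taken from the paper).
Repair = Dict[str, str]            # {"attr": ..., "old": ..., "new": ...}
Member = Callable[[Repair], bool]  # True means "this repair looks correct"


def vote_entropy(votes: Sequence[bool]) -> float:
    """Binary entropy in bits of the committee votes: 0 = unanimous, 1 = even split."""
    p = sum(votes) / len(votes)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def rank_by_uncertainty(
    repairs: Sequence[Repair], committee: Sequence[Member]
) -> List[Tuple[float, Repair]]:
    """Return (score, repair) pairs, most uncertain first, so the user sees those first."""
    scored = [(vote_entropy([m(r) for m in committee]), r) for r in repairs]
    # Sort on the score only; ties keep their input order because sorted() is stable.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy committee: each member is a simple heuristic standing in for a trained model.
committee: List[Member] = [
    lambda r: len(r["new"]) > 0,               # repaired value is non-empty
    lambda r: len(r["new"]) == len(r["old"]),  # repair preserves the value length
    lambda r: r["new"].isdigit(),              # repaired value is numeric
    lambda r: r["new"] != r["old"],            # repair actually changes the value
]

repairs: List[Repair] = [
    {"attr": "zip", "old": "9021", "new": "90210"},
    {"attr": "zip", "old": "90210", "new": "90211"},
    {"attr": "city", "old": "LA", "new": "Los Angeles"},
]

ranked = rank_by_uncertainty(repairs, committee)
for score, repair in ranked:
    print(f"{score:.3f}  {repair['attr']}: {repair['old']!r} -> {repair['new']!r}")
```

Under these assumptions, the repair on which the committee splits evenly is ranked first and the unanimous one last. This matches the intent of asking the user about the suggestions where feedback is most informative for the learned model.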