Potter's Wheel: An Interactive Data Cleaning System
Proceedings of the 27th International Conference on Very Large Data Bases
A cost-based model and effective heuristic for repairing constraints by value modification
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Proceedings of the VLDB Endowment
Interaction between record matching and data repairing
Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
Towards certain fixes with editing rules and master data
The VLDB Journal — The International Journal on Very Large Data Bases
NADEEF: a commodity data cleaning system
Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data
Holistic data cleaning: Putting violations into context
ICDE '13 Proceedings of the 2013 IEEE International Conference on Data Engineering (ICDE 2013)
Hi-index | 0.00 |
We present NADEEF, an extensible, generic and easy-to-deploy data cleaning system. NADEEF distinguishes between a programming interface and a core to achieve generality and extensibility. The programming interface allows users to specify data quality rules by writing code that implements predefined classes. These classes uniformly define what is wrong with the data and (possibly) how to fix it. We will demonstrate the following features provided by NADEEF. (1) Heterogeneity: The programming interface can be used to express many types of data quality rules beyond the well known CFDs (FDs), MDs and ETL rules. (2) Interdependency: The core algorithms can interleave multiple types of rules to detect and repair data errors. (3) Deployment and extensibility: Users can easily customize NADEEF by defining new types of rules, or by extending the core. (4) Metadata management and data custodians: We show a live data quality dashboard to effectively involve users in the data cleaning process.