Open business intelligence: on the importance of data quality awareness in user-friendly data mining
Proceedings of the 2012 Joint EDBT/ICDT Workshops
A successful data mining process depends on the quality of the source data in order to obtain reliable knowledge. Data must therefore be preprocessed with data quality criteria in mind. However, preprocessing has traditionally been seen as a time-consuming and non-trivial task, since data quality criteria have to be considered without any guidance on how they affect the data mining process. To overcome this situation, in this paper we propose analyzing data mining techniques to learn how different data quality criteria behave on the sources and how they affect the results of the algorithms. To this end, we have conducted a set of experiments to assess three data quality criteria: completeness, correlation, and balance of data. This work is a first step towards considering data quality criteria in a systematic and structured manner, in order to support and guide data miners in obtaining reliable knowledge.