Describing differences between databases
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
Information integration often faces the problem that different data sources represent the same set of real-world objects but give conflicting values for specific properties of these objects. In this paper we present a model of such conflicts and describe an algorithm for efficiently detecting patterns of conflicts in a pair of overlapping data sources. The contradiction patterns we find are a special kind of association rule: they describe regularities in conflicts that occur together with certain attribute values, with pairs of attribute values, or with other conflicts. We therefore adapt existing association rule mining algorithms to mine contradiction patterns. Such patterns are an important tool for human experts who use domain knowledge to find and resolve data quality problems. We present the results of applying our method to a real-world data set from the life science domain and show how it helps to generate clean data for integrated data warehouses.
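The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes each source is a dictionary from object identifier to attribute values, encodes every shared object as a transaction of agreeing values plus conflict markers, and counts only single-antecedent rules of the form "item implies conflict on attribute". The function names, thresholds, and sample records are all hypothetical.

```python
from collections import Counter


def build_transactions(source_a, source_b):
    """Build one item set per object that appears in both sources.

    Items are ("value", attr, v) where the sources agree on attr,
    and ("conflict", attr) where they disagree (assumed encoding).
    """
    transactions = []
    for key in source_a.keys() & source_b.keys():
        rec_a, rec_b = source_a[key], source_b[key]
        items = set()
        for attr in rec_a.keys() & rec_b.keys():
            if rec_a[attr] == rec_b[attr]:
                items.add(("value", attr, rec_a[attr]))
            else:
                items.add(("conflict", attr))
        transactions.append(items)
    return transactions


def mine_conflict_rules(transactions, min_support=2, min_confidence=0.8):
    """Return rules (antecedent, conflict, support, confidence).

    Only single-item antecedents are considered; a full Apriori-style
    miner would also grow larger antecedent sets level by level.
    """
    item_counts = Counter()
    pair_counts = Counter()
    for items in transactions:
        item_counts.update(items)
        for target in items:
            if target[0] != "conflict":
                continue
            for antecedent in items:
                if antecedent != target:
                    pair_counts[(antecedent, target)] += 1

    rules = []
    for (antecedent, target), support in pair_counts.items():
        confidence = support / item_counts[antecedent]
        if support >= min_support and confidence >= min_confidence:
            rules.append((antecedent, target, support, confidence))
    return rules


# Hypothetical sample data: the sources disagree on "length"
# exactly for the objects whose organism is "human".
source_a = {
    1: {"organism": "human", "length": 120},
    2: {"organism": "human", "length": 80},
    3: {"organism": "mouse", "length": 95},
    4: {"organism": "mouse", "length": 60},
}
source_b = {
    1: {"organism": "human", "length": 121},
    2: {"organism": "human", "length": 83},
    3: {"organism": "mouse", "length": 95},
    4: {"organism": "mouse", "length": 60},
}

rules = mine_conflict_rules(build_transactions(source_a, source_b))
for antecedent, target, support, confidence in rules:
    print(antecedent, "=>", target, support, confidence)
```

On this toy input the only rule that passes both thresholds links the agreeing value organism = "human" to a conflict on length. A rule like that is the kind of hint a domain expert could follow up on, for example by checking whether one source measures length differently for human entries.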