Traditionally, data quality programs have acted as a preprocessing stage that makes data suitable for a data mining or analysis operation. Recently, data quality concepts have been applied to databases that support business operations such as provisioning and billing. Incorporating the business rules that drive operations and their associated data processes is critical to the success of such projects. However, there are many practical complications. Documentation on business rules is often meager. Rules change frequently. Domain knowledge is often fragmented across experts, and those experts do not always agree. Typically, rules have to be gathered from subject matter experts iteratively, and they are discovered out of logical or procedural sequence, like the pieces of a jigsaw puzzle. Our approach is to implement business rules as constraints on data in a classical expert system formalism sometimes called production rules. Our system works by allowing good data to pass through a system of constraints unchecked. Bad data violate constraints and are flagged, and then fed back after correction. Constraints are added incrementally as a better understanding of the business rules is gained. We include a real-life case study.
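The flow described above (constraints as production rules, flagging of violations, feedback after correction, and incremental addition of rules) can be illustrated with a minimal Python sketch. It is not the system described in the abstract: the `Rule` class, the `check` function, the record fields (`provisioned`, `billed`, `rate`), and both example rules are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]  # hypothetical record layout, for illustration only


@dataclass
class Rule:
    """A production-rule style constraint: IF applies(record) THEN holds(record)."""
    name: str
    applies: Callable[[Record], bool]  # condition part of the rule
    holds: Callable[[Record], bool]    # constraint that must then be satisfied


def check(records: List[Record], rules: List[Rule]
          ) -> Tuple[List[Record], List[Tuple[Record, List[str]]]]:
    """Split records into those that pass every rule and those that are flagged."""
    passed, flagged = [], []
    for rec in records:
        violated = [r.name for r in rules if r.applies(rec) and not r.holds(rec)]
        if violated:
            flagged.append((rec, violated))  # keep the rule names for the expert
        else:
            passed.append(rec)
    return passed, flagged


# First rule gathered from a (hypothetical) subject matter expert.
rules = [
    Rule("billed_requires_provisioned",
         applies=lambda r: r.get("billed") is True,
         holds=lambda r: r.get("provisioned") is True),
]

records = [
    {"id": 1, "provisioned": True, "billed": True, "rate": 10},
    {"id": 2, "provisioned": False, "billed": True, "rate": 10},
    {"id": 3, "provisioned": True, "billed": True, "rate": -5},
]

passed, flagged = check(records, rules)

# A second rule is added later, as understanding of the business improves.
rules.append(Rule("rate_non_negative",
                  applies=lambda r: "rate" in r,
                  holds=lambda r: r["rate"] >= 0))
passed2, flagged2 = check(records, rules)

# Flagged records are corrected (here by hand-written fixes) and fed back.
fixes = {2: {"provisioned": True}, 3: {"rate": 5}}
corrected = [{**rec, **fixes[rec["id"]]} for rec, _ in flagged2]
passed3, flagged3 = check(corrected, rules)
```

Keeping each rule as an independent condition and constraint pair means rules can be appended in whatever order the experts supply them, which matches the out-of-sequence discovery process the abstract describes.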