Evaluating noise correction

Authors:
Choh Man Teng
Affiliations:
School of Computer Science and Engineering, University of New South Wales, Sydney, NSW, Australia
Venue:
PRICAI'00: Proceedings of the 6th Pacific Rim International Conference on Artificial Intelligence
Year:
2000

Abstract

Data quality is a prime concern for many tasks in learning and induction. In a previous paper we proposed a noise correction mechanism called polishing, which exploits the interdependence between the components of a data set to identify noisy values and their appropriate replacements. Designing a sound and informative metric for evaluating the effectiveness of a noise correction scheme turned out to be non-trivial. Here we motivate a number of classifier-dependent measures and proximity measures, each focusing on a different aspect of the corrected data and the associated classifier. We report on extended experimentation with polishing, as measured by the proposed metrics. The results suggest that polishing is able to repair a corrupted data set to some extent, and that the metrics we devised appear to be reasonable.
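
The abstract does not spell out the proposed metrics, so the following is only a minimal sketch of what a proximity-style measure might look like: the fraction of attribute values in a data set that agree with a known clean reference. The function name `cell_agreement`, the toy data, and the hand-made "polished" set are all assumptions made for illustration; they are not the measures or results reported in the paper.

```python
from typing import Sequence

Row = Sequence[object]


def cell_agreement(reference: Sequence[Row], candidate: Sequence[Row]) -> float:
    """Fraction of attribute values in `candidate` equal to those in `reference`.

    Hypothetical proximity measure: 1.0 means the two data sets are identical.
    """
    if len(reference) != len(candidate):
        raise ValueError("data sets must have the same number of rows")
    total = 0
    matches = 0
    for ref_row, cand_row in zip(reference, candidate):
        if len(ref_row) != len(cand_row):
            raise ValueError("rows must have the same number of attributes")
        total += len(ref_row)
        matches += sum(r == c for r, c in zip(ref_row, cand_row))
    # Two empty data sets are treated as identical.
    return matches / total if total else 1.0


# Toy data (invented): a clean set, a corrupted copy, and a partly repaired copy.
clean = [("sunny", "hot", "no"), ("rainy", "mild", "yes")]
noisy = [("sunny", "cold", "no"), ("rainy", "mild", "no")]
polished = [("sunny", "hot", "no"), ("rainy", "mild", "no")]

noisy_score = cell_agreement(clean, noisy)
polished_score = cell_agreement(clean, polished)
print(noisy_score, polished_score)
```

Under this sketch, a correction scheme is doing useful work when the corrected data scores closer to the clean reference than the noisy data does. A measure of this kind requires access to the clean data, so it only applies in controlled experiments where noise is injected artificially.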