Data quality is crucial to any data-analysis task, yet blemishes in data can arise from many sources. We must therefore understand data imperfections and the effectiveness of the techniques used to handle them. The author compares three approaches: robust algorithms, which tolerate some corruption; filtering, which eliminates noisy instances from the input; and polishing, which corrects noisy instances rather than removing them. The author argues that polishing has theoretical advantages over the first two approaches and can achieve better results. He also discusses how to evaluate and validate data-correction methods, identifying pitfalls and offering suggestions for designing metrics that accurately reflect the extent of correction.
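The difference between filtering and polishing can be illustrated with a toy sketch. This is not the author's method: the record layout, the `is_noisy` test, and the `repair` function are all hypothetical placeholders, and how noisy instances are actually identified and corrected is left open here.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]  # hypothetical record type: attribute name -> value


def filter_noise(records: List[Record],
                 is_noisy: Callable[[Record], bool]) -> List[Record]:
    """Filtering: drop every record flagged as noisy."""
    return [r for r in records if not is_noisy(r)]


def polish_noise(records: List[Record],
                 is_noisy: Callable[[Record], bool],
                 repair: Callable[[Record], Record]) -> List[Record]:
    """Polishing: keep every record, replacing flagged ones with a repaired copy."""
    return [repair(r) if is_noisy(r) else r for r in records]


# Toy data: an age of -1 stands in for a corrupted value (an assumption).
data = [
    {"age": 34, "label": "a"},
    {"age": -1, "label": "b"},
    {"age": 51, "label": "a"},
]


def is_noisy(r: Record) -> bool:
    return r["age"] < 0


def repair(r: Record) -> Record:
    # Placeholder correction; a real system might predict the value
    # from the record's other attributes.
    return {**r, "age": 42}


filtered = filter_noise(data, is_noisy)
polished = polish_noise(data, is_noisy, repair)
print(len(filtered), len(polished))
```

In this sketch, filtering shrinks the data set and discards the flagged record's uncorrupted attributes (here, its label), while polishing keeps the data set at full size and preserves them.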