IEEE Transactions on Knowledge and Data Engineering
Recent research in machine learning, data mining, and related areas has produced a wide variety of algorithms for cost-sensitive (CS) classification, where the objective is to minimize the misclassification cost rather than to maximize classification accuracy. However, these methods assume that training sets do not contain significant noise, which is rarely the case in real-world environments. In this paper, we systematically study the impact of class noise on CS learning and propose a cost-guided class noise handling algorithm that identifies noise for effective CS learning. We call it the Cost-guided Iterative Classification Filter (CICF), because it seamlessly integrates costs with an existing Classification Filter for noise identification. Unlike existing approaches, which give equal weight to noise handling in all classes, CICF puts more emphasis on expensive classes, which makes it especially successful on datasets with a large cost ratio. Experimental results and comparative studies on real-world datasets indicate that noise may seriously degrade the performance of CS classifiers, and that adopting the proposed CICF algorithm can significantly reduce the misclassification cost of a CS classifier in noisy environments.
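To make the cost-sensitive objective concrete, the following is a minimal Python sketch of the standard minimum-expected-cost decision rule. It is not the CICF algorithm, whose details the abstract does not give. The cost matrix, the 10:1 cost ratio, the class probabilities, and all function names are hypothetical values chosen only for illustration.

```python
from typing import List, Sequence


def expected_costs(probs: Sequence[float], cost: Sequence[Sequence[float]]) -> List[float]:
    """Expected cost of predicting each class j.

    cost[i][j] is the (assumed) cost of predicting class j when the
    true class is i; probs[i] is the estimated probability of class i.
    """
    n = len(cost)
    return [sum(probs[i] * cost[i][j] for i in range(n)) for j in range(n)]


def min_cost_class(probs: Sequence[float], cost: Sequence[Sequence[float]]) -> int:
    """Cost-sensitive decision: pick the class with the lowest expected cost."""
    costs = expected_costs(probs, cost)
    return min(range(len(costs)), key=costs.__getitem__)


def max_prob_class(probs: Sequence[float]) -> int:
    """Accuracy-oriented decision: pick the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)


def total_cost(y_true: Sequence[int], y_pred: Sequence[int],
               cost: Sequence[Sequence[float]]) -> float:
    """Total misclassification cost of a set of predictions."""
    return sum(cost[t][p] for t, p in zip(y_true, y_pred))


# Hypothetical two-class problem: missing class 1 costs 10, a false alarm costs 1.
COST = [[0, 1],
        [10, 0]]

# Hypothetical probability estimate for one instance: 80% class 0, 20% class 1.
probs = [0.8, 0.2]

accuracy_choice = max_prob_class(probs)   # most probable class
cost_choice = min_cost_class(probs, COST)  # cheapest class in expectation
```

With these illustrative numbers, predicting class 0 has an expected cost of 0.2 × 10 = 2.0, while predicting class 1 costs 0.8 × 1 = 0.8, so the two rules disagree even though class 0 is more probable. This also suggests why mislabeled instances of an expensive class matter more: they distort exactly the probability estimates that the cost-weighted decision is most sensitive to.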