Tackling the problem of classification with noisy data using Multiple Classifier Systems: Analysis of the performance and robustness

Authors:
José A. Sáez; Mikel Galar; Julián Luengo; Francisco Herrera
Venue:
Information Sciences: an International Journal
Year:
2013

Abstract

Traditional classifier learning algorithms build a single classifier from the training data. Noisy data may degrade the performance of this classifier, depending on how sensitive the learning method is to data corruption. The literature widely claims that building several classifiers from noisy training data and combining their predictions is an interesting way of overcoming the individual problems that noise produces in each classifier. However, this claim is usually not supported by thorough empirical studies that consider problems with different types and levels of noise. Furthermore, in noisy environments, the noise robustness of a method can be more important than its performance results and must therefore be studied carefully. This paper aims to reach conclusions on these aspects by analyzing the behavior, in terms of performance and robustness, of several Multiple Classifier Systems against their individual classifiers when trained with noisy data. To carry out this study, several classification algorithms of varying noise robustness are chosen and compared with their combinations on a large collection of noisy datasets. The results show that the success of Multiple Classifier Systems trained with noisy data depends on the individual classifiers chosen, the decision combination method, and the type and level of noise present in the dataset, as well as on the way diversity is created to build the final system. In most cases, they outperform all of their single classification algorithms in terms of global performance, although their robustness depends on the way diversity is introduced into the Multiple Classifier System.
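
As a rough illustration of the setting the abstract describes, the sketch below corrupts training labels at a chosen noise level and combines the predictions of several classifiers. The abstract does not specify the noise scheme or the combination rule, so the uniform class noise and the plain majority vote used here are assumptions, and the function names (`inject_class_noise`, `majority_vote`, `accuracy`) are hypothetical rather than taken from the paper.

```python
import random
from collections import Counter


def inject_class_noise(labels, noise_rate, classes, seed=0):
    """Relabel round(noise_rate * n) randomly chosen examples.

    Assumption: uniform class noise, where each selected example receives
    a different class drawn uniformly at random from the other classes.
    """
    rng = random.Random(seed)
    noisy = list(labels)
    n_noisy = round(noise_rate * len(noisy))
    for i in rng.sample(range(len(noisy)), n_noisy):
        noisy[i] = rng.choice([c for c in classes if c != noisy[i]])
    return noisy


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority vote.

    Assumption: ties go to the label proposed by the earliest classifier
    among the tied labels.
    """
    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        best = max(counts.values())
        combined.append(next(v for v in votes if counts[v] == best))
    return combined


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In a study of this kind, each base classifier would be trained on labels produced by `inject_class_noise` at several noise rates, and the accuracy of the combined prediction on clean test labels would be compared with that of each individual classifier.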