Overfitting cautious selection of classifier ensembles with genetic algorithms

Authors:
Eulanda M. Dos Santos;Robert Sabourin;Patrick Maupin
Affiliations:
Ecole de Technologie Superieure - ETS, Genie de la production automatisee, 1100, Rue Notre-Dame Ouest, Montreal, Quebec, Canada H3C1K3;Ecole de Technologie Superieure - ETS, Genie de la production automatisee, 1100, Rue Notre-Dame Ouest, Montreal, Quebec, Canada H3C1K3;Ecole de Technologie Superieure - ETS, Genie de la production automatisee, 1100, Rue Notre-Dame Ouest, Montreal, Quebec, Canada H3C1K3
Venue:
Information Fusion
Year:
2009

Abstract

Information fusion research has recently focused on the characteristics of the decision profiles of ensemble members in order to optimize performance. These characteristics are particularly important when selecting ensemble members. However, although controlling overfitting is a well-known challenge in machine learning, much less work has been devoted to controlling it in selection tasks. The objectives of this paper are (1) to show that overfitting can be detected at the selection stage and (2) to present strategies to control it. Ensembles are generated with the bagging and random subspace methods; decision trees and k-nearest neighbors classifiers are used to create homogeneous ensembles, and single- and multi-objective genetic algorithms serve as the search algorithms at the selection stage. The classification error rate and a set of diversity measures are applied as search criteria. We show experimentally that the selection of classifier ensembles conducted by genetic algorithms is prone to overfitting, especially in the multi-objective case. The partial validation, backwarding, and global validation strategies are tailored to the classifier ensemble selection problem and compared. This comparison shows that a global validation strategy should be applied to control overfitting in pattern recognition systems involving an ensemble member selection task. Furthermore, the study establishes that the global validation strategy can be used to measure the relationship between diversity and classification performance when diversity measures are employed as single-objective functions.
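
The abstract names global validation without describing its mechanics, so the following is only a minimal sketch under stated assumptions. It assumes that global validation means checking every candidate ensemble produced during the genetic search against a separate validation set and archiving the best one found. That reading, the synthetic binary data, the majority-vote combiner, the genetic operators, and all function names (majority_vote_error, make_pool, ga_select) are illustrative assumptions and are not taken from the paper.

```python
import random


def majority_vote_error(mask, preds, labels):
    """Error rate of the majority vote of the classifiers selected by mask.

    Labels are binary (0/1); ties are resolved in favour of class 0.
    """
    members = [p for bit, p in zip(mask, preds) if bit]
    if not members:
        return 1.0  # assumption: an empty ensemble counts as always wrong
    errors = 0
    for i, y in enumerate(labels):
        ones = sum(m[i] for m in members)
        vote = 1 if 2 * ones > len(members) else 0
        if vote != y:
            errors += 1
    return errors / len(labels)


def make_pool(rng, labels, accuracies):
    """Synthetic stand-in for a classifier pool (hypothetical helper).

    Classifier k predicts the true label with probability accuracies[k].
    """
    return [[y if rng.random() < acc else 1 - y for y in labels]
            for acc in accuracies]


def ga_select(opt_preds, opt_labels, val_preds, val_labels,
              pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Single-objective GA over bit masks with a validation archive.

    Fitness is the error on the optimization set. Every candidate is
    also scored on the validation set, and the best one is archived.
    """
    rng = random.Random(seed)
    n = len(opt_preds)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    archive, archive_err = None, float("inf")
    best_opt = None

    for gen in range(generations):
        fitness = [majority_vote_error(c, opt_preds, opt_labels) for c in pop]
        best_opt = pop[fitness.index(min(fitness))][:]

        # Assumed global validation step: score all candidates on the
        # validation set and keep the best one seen so far.
        for cand in pop:
            err = majority_vote_error(cand, val_preds, val_labels)
            if err < archive_err:
                archive, archive_err = cand[:], err

        if gen == generations - 1:
            break

        def pick():
            # Binary tournament on optimization-set error.
            a, b = rng.sample(range(pop_size), 2)
            return pop[a] if fitness[a] <= fitness[b] else pop[b]

        children = []
        for _ in range(pop_size):
            p1, p2 = pick(), pick()
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            children.append([1 - g if rng.random() < p_mut else g
                             for g in child])
        pop = children

    return best_opt, archive, archive_err


if __name__ == "__main__":
    rng = random.Random(1)
    accuracies = [0.55 + 0.03 * k for k in range(10)]
    opt_labels = [rng.randint(0, 1) for _ in range(60)]
    val_labels = [rng.randint(0, 1) for _ in range(60)]
    opt_preds = make_pool(rng, opt_labels, accuracies)
    val_preds = make_pool(rng, val_labels, accuracies)
    best_opt, archive, archive_err = ga_select(
        opt_preds, opt_labels, val_preds, val_labels)
    print("validation error, last-generation best:",
          majority_vote_error(best_opt, val_preds, val_labels))
    print("validation error, archived solution:", archive_err)
```

In this sketch the archived solution can never have a higher validation error than the candidate that merely minimizes the optimization-set error in the last generation, because the archive is updated over every candidate evaluated, including that one. A fair assessment of either solution would still require a third, untouched test set.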