Pareto analysis for the selection of classifier ensembles

  • Authors:
  • Eulanda M. Dos Santos;Robert Sabourin;Patrick Maupin

  • Affiliations:
  • Ecole de technologie superieure-University of Quebec, Montreal, Canada;Ecole de technologie superieure-University of Quebec, Montreal, Canada;Defence Research and Development Canada, Val-Belair, Canada

  • Venue:
  • Proceedings of the 10th annual conference on Genetic and evolutionary computation
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

The overproduce-and-choose strategy involves the generation of an initial large pool of candidate classifiers and it is intended to test different candidate ensembles in order to select the best performing solution. The ensemble's error rate, ensemble size and diversity measures are the most frequent search criteria employed to guide this selection. By applying the error rate, we may accomplish the main objective in Pattern Recognition and Machine Learning, which is to find high-performance predictors. In terms of ensemble size, the hope is to increase the recognition rate while minimizing the number of classifiers in order to meet both the performance and low ensemble size requirements. Finally, ensembles can be more accurate than individual classifiers only when classifier members present diversity among themselves. In this paper we apply two Pareto front spread quality measures to analyze the relationship between the three main search criteria used in the overproduce-and-choose strategy. Experimental results conducted demonstrate that the combination of ensemble size and diversity does not produce conflicting multi-objective optimization problems. Moreover, we cannot decrease the generalization error rate by combining this pair of search criteria. However, when the error rate is combined with diversity or the ensemble size, we found that these measures are conflicting objective functions and that the performances of the solutions are much higher.