Using genetic programming to obtain implicit diversity

  • Authors:
  • Ulf Johansson;Cecilia Sönströd;Tuve Löfström;Rikard König

  • Affiliations:
  • School of Business and Informatics, University of Borås, Borås, Sweden;School of Business and Informatics, University of Borås, Borås, Sweden;School of Business and Informatics, University of Borås, Borås, Sweden;School of Business and Informatics, University of Borås, Borås, Sweden

  • Venue:
  • CEC'09 Proceedings of the Eleventh conference on Congress on Evolutionary Computation
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

When performing predictive data mining, the use of ensembles is known to increase prediction accuracy, compared to single models. To obtain this higher accuracy, ensembles should be built from base classifiers that are both accurate and diverse. The question of how to balance these two properties in order to maximize ensemble accuracy is, however, far from solved and many different techniques for obtaining ensemble diversity exist. One such technique is bagging, where implicit diversity is introduced by training base classifiers on different subsets of available data instances, thus resulting in less accurate, but diverse base classifiers. In this paper, genetic programming is used as an alternative method to obtain implicit diversity in ensembles by evolving accurate, but different base classifiers in the form of decision trees, thus exploiting the inherent inconsistency of genetic programming. The experiments show that the GP approach outperforms standard bagging of decision trees, obtaining significantly higher ensemble accuracy over 25 UCI datasets. This superior performance stems from base classifiers having both higher average accuracy and more diversity. Implicitly introducing diversity using GP thus works very well, since evolved base classifiers tend to be highly accurate and diverse.