Evolving data sets to highlight the performance differences between machine learning classifiers

  • Authors: Thomas Raway; David J. Schaffer; Kenneth J. Kurtz; Hiroki Sayama

  • Affiliations: Binghamton University, Binghamton, NY, USA (all authors)

  • Venue: Proceedings of the 14th annual conference companion on Genetic and evolutionary computation
  • Year: 2012

Abstract

We present a preliminary study that evolves data sets to maximize the performance differences between multiple machine learning classifiers. The aim is to inform the choice of classifier for a given data set. While the literature already compares multiple classifiers across pre-existing data sets, our approach is novel in that it evolves entirely new data sets designed to highlight the performance differences between supervised learning classifiers. By investigating these evolved data sets, we hope to add to the knowledge base concerning which classifiers are appropriate for specific real-world classification tasks.
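
The core idea of the abstract, searching for a data set on which one classifier outperforms another, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two classifiers (1-nearest-neighbor and nearest-centroid), the leave-one-out accuracy measure, the two-dimensional points with fixed labels, and the simple mutate-and-keep-if-no-worse hill climber that stands in for whatever evolutionary algorithm the authors used. The function names (`evolve`, `fitness`, `mutate`) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def one_nn_predict(train, point):
    """Label of the single closest training point (1-nearest-neighbor)."""
    _, label = min(train, key=lambda item: sq_dist(item[0], point))
    return label


def nearest_centroid_predict(train, point):
    """Label of the class whose mean point is closest to `point`."""
    sums = {}
    for (x, y), label in train:
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    centroids = {lab: (sx / n, sy / n) for lab, (sx, sy, n) in sums.items()}
    return min(centroids, key=lambda lab: sq_dist(centroids[lab], point))


def loo_accuracy(predict, data):
    """Leave-one-out accuracy of `predict` on `data`."""
    hits = 0
    for i, (point, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        hits += predict(train, point) == label
    return hits / len(data)


def fitness(data):
    """Assumed objective: how much 1-NN beats nearest-centroid."""
    return (loo_accuracy(one_nn_predict, data)
            - loo_accuracy(nearest_centroid_predict, data))


def mutate(data, rng, sigma=0.1):
    """Copy the data set and jitter one randomly chosen point."""
    child = list(data)
    i = rng.randrange(len(child))
    (x, y), label = child[i]
    child[i] = ((x + rng.gauss(0, sigma), y + rng.gauss(0, sigma)), label)
    return child


def evolve(n_points=20, generations=300, seed=0):
    """Hill-climb a random data set toward a larger accuracy gap."""
    rng = random.Random(seed)
    # Random points in the unit square with alternating (balanced) labels.
    data = [((rng.random(), rng.random()), i % 2) for i in range(n_points)]
    initial = best = fitness(data)
    for _ in range(generations):
        child = mutate(data, rng)
        score = fitness(child)
        if score >= best:  # keep the child only if it is no worse
            data, best = child, score
    return data, initial, best


if __name__ == "__main__":
    _, initial, best = evolve()
    print(f"accuracy gap before: {initial:.2f}, after: {best:.2f}")
```

Inspecting the returned points would then show what kind of geometry favors one classifier over the other, which is the sort of insight the abstract hopes to gain from evolved data sets. A population-based genetic algorithm with crossover could replace the hill climber without changing the fitness function.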