Unbiased assessment of learning algorithms

  • Authors:
  • Tobias Scheffer; Ralf Herbrich

  • Affiliations:
  • Technische Universität Berlin, Artificial Intelligence Group, Berlin, Germany (both authors)

  • Venue:
  • IJCAI'97: Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence - Volume 2
  • Year:
  • 1997

Abstract

In order to rank the performance of machine learning algorithms, many researchers conduct experiments on benchmark data sets. Since most learning algorithms have domain-specific parameters, it is common practice to tune these parameters to obtain a minimal error rate on the test set. The same rate is then used to rank the algorithm, which causes an optimistic bias. We quantify this bias, showing in particular that an algorithm with more parameters will probably be ranked higher than an equally good algorithm with fewer parameters. We demonstrate this effect by showing the number of parameters and trials required to appear to outperform C4.5 or FOIL, respectively, on various benchmark problems. We then describe how unbiased ranking experiments should be conducted.
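
The optimistic bias described in the abstract is easy to reproduce in simulation. The sketch below is a hypothetical illustration, not the authors' experimental setup: it models each parameter trial's measured test-set error as an independent binomial sample around a fixed true error rate (real trials would be correlated, which weakens but does not remove the effect). Reporting the minimum error over many trials, as in the biased protocol, drives the reported error below the true rate; tuning on a separate validation set and measuring once on an untouched test set does not.

```python
import random

def observed_error(true_error, n, rng):
    """Error rate measured on a test set of n examples,
    given the algorithm's true error rate."""
    return sum(rng.random() < true_error for _ in range(n)) / n

def biased_score(true_error, n, trials, rng):
    """Biased protocol: tune parameters on the test set itself,
    reporting the best of `trials` measurements."""
    return min(observed_error(true_error, n, rng) for _ in range(trials))

def unbiased_score(true_error, n, trials, rng):
    """Unbiased protocol: tune on a separate validation set, then
    measure once on a fresh test set."""
    # Parameter selection happens on validation data (simulated and
    # discarded here); the reported score comes from one untouched test set.
    _ = [observed_error(true_error, n, rng) for _ in range(trials)]
    return observed_error(true_error, n, rng)

if __name__ == "__main__":
    rng = random.Random(0)
    true_error, n, repetitions = 0.25, 200, 1000  # illustrative values
    for trials in (1, 10, 100):
        biased = sum(biased_score(true_error, n, trials, rng)
                     for _ in range(repetitions)) / repetitions
        unbiased = sum(unbiased_score(true_error, n, trials, rng)
                       for _ in range(repetitions)) / repetitions
        print(f"{trials:3d} parameter trials: "
              f"biased estimate {biased:.3f}, unbiased {unbiased:.3f}")
```

Note how the biased estimate keeps dropping as the number of parameter trials grows while the unbiased estimate stays near the true error rate, mirroring the paper's point that an algorithm with more tunable parameters will probably be ranked higher than an equally good algorithm with fewer.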