Is "best-so-far" a good algorithmic performance metric?

  • Authors:
  • Nathaniel P. Troutman; Brent E. Eskridge; Dean F. Hougen

  • Affiliations:
  • Southern Nazarene University, Bethany, OK, USA; Southern Nazarene University, Bethany, OK, USA; University of Oklahoma, Norman, OK, USA

  • Venue:
  • Proceedings of the 10th Annual Conference on Genetic and Evolutionary Computation (GECCO '08)
  • Year:
  • 2008

Abstract

In evolutionary computation, experimental results are commonly analyzed using an algorithmic performance metric called best-so-far. While best-so-far can be a useful metric, its nature makes it particularly susceptible to three pitfalls: a failure to establish a baseline for comparison, a failure to perform significance testing, and an insufficient sample size. If these pitfalls are not avoided, the best-so-far metric can lead to confusion at best and misleading results at worst. We detail how the use of multiple experimental runs, random search as a baseline, and significance testing can help researchers avoid these common pitfalls. Furthermore, we demonstrate how best-so-far can be an effective algorithmic performance metric when these guidelines are followed.
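As a concrete illustration of the abstract's guidance, the Python sketch below (hypothetical; not taken from the paper) computes best-so-far values over many independent runs, compares an optimizer against a random-search baseline, and applies a significance test. The toy fitness function, the hill-climbing stand-in, the run counts, and the choice of scipy's Mann-Whitney U test are all assumptions made for illustration.

```python
# Hypothetical sketch: evaluating "best-so-far" with multiple runs,
# a random-search baseline, and significance testing.
import random
from scipy.stats import mannwhitneyu  # nonparametric significance test


def best_so_far(fitnesses):
    """Running maximum: the best fitness seen so far at each evaluation."""
    curve, best = [], float("-inf")
    for f in fitnesses:
        best = max(best, f)
        curve.append(best)
    return curve


def evaluate(x):
    """Toy fitness function (assumption): maximum of 1.0 at x = 0.7."""
    return 1.0 - abs(x - 0.7)


def random_search(n_evals):
    """Baseline: fitnesses of uniformly random candidates in [0, 1]."""
    return [evaluate(random.random()) for _ in range(n_evals)]


def toy_optimizer(n_evals):
    """Stand-in for an algorithm under study: noisy hill climbing."""
    x = random.random()
    fx = evaluate(x)
    history = []
    for _ in range(n_evals):
        cand = min(1.0, max(0.0, x + random.gauss(0.0, 0.05)))
        fc = evaluate(cand)
        if fc >= fx:
            x, fx = cand, fc
        history.append(fc)  # record every evaluated candidate's fitness
    return history


N_RUNS, N_EVALS = 50, 200  # many independent runs, per the paper's guidance

# Final best-so-far value from each independent run of each method.
algo_finals = [best_so_far(toy_optimizer(N_EVALS))[-1] for _ in range(N_RUNS)]
base_finals = [best_so_far(random_search(N_EVALS))[-1] for _ in range(N_RUNS)]

# Significance test: is the optimizer reliably better than random search?
stat, p = mannwhitneyu(algo_finals, base_finals, alternative="greater")
print(f"optimizer mean: {sum(algo_finals) / N_RUNS:.3f}, "
      f"baseline mean: {sum(base_finals) / N_RUNS:.3f}, p = {p:.4f}")
```

A nonparametric test is used in this sketch because final best-so-far values are often not normally distributed; the paper's own statistical methodology may differ.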