On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach

  • Authors: Steven L. Salzberg
  • Affiliations: Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA. E-mail: salzberg@cs.jhu.edu
  • Venue: Data Mining and Knowledge Discovery
  • Year: 1997

Abstract

An important component of many data mining projects is finding a good classification algorithm, a process that requires very careful thought about experimental design. If not done very carefully, comparative studies of classification and other types of algorithms can easily result in statistically invalid conclusions. This is especially true when one is using data mining techniques to analyze very large databases, which inevitably contain some statistically unlikely data. This paper describes several phenomena that can, if ignored, invalidate an experimental comparison. These phenomena and the conclusions that follow apply not only to classification, but to computational experiments in almost any aspect of data mining. The paper also discusses why comparative analysis is more important in evaluating some types of algorithms than others, and provides some suggestions about how to avoid the pitfalls suffered by many experimental studies.
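
One pitfall the abstract alludes to is the multiple-comparisons problem: when many candidate classifiers are tested against a baseline on the same data, some will look significantly better purely by chance. Below is a minimal, hypothetical Python sketch (not from the paper) illustrating this effect, together with a Bonferroni-adjusted threshold, a correction of the kind the paper discusses. All sizes and names here (`N_TEST`, `N_CLASSIFIERS`, and so on) are illustrative assumptions.

```python
# Illustrative sketch of the multiple-comparisons pitfall (assumed setup,
# not code from the paper). Every candidate classifier is constructed to be
# exactly as good as the baseline, so any "significant win" is a false positive.

import numpy as np
from scipy.stats import binomtest

rng = np.random.default_rng(0)

N_TEST = 1000        # test-set size (assumed)
N_CLASSIFIERS = 100  # number of candidate classifiers compared (assumed)
ALPHA = 0.05         # nominal significance level

false_positives_raw = 0
false_positives_bonferroni = 0

for _ in range(N_CLASSIFIERS):
    # Baseline and candidate are each correct on a test example with p = 0.5,
    # independently, i.e. the candidate is truly no better than the baseline.
    baseline_correct = rng.random(N_TEST) < 0.5
    candidate_correct = rng.random(N_TEST) < 0.5

    # Sign test on the examples where the two classifiers disagree
    # (the idea behind McNemar-style paired comparisons).
    cand_only = int(np.sum(candidate_correct & ~baseline_correct))
    base_only = int(np.sum(baseline_correct & ~candidate_correct))
    n_disagree = cand_only + base_only
    if n_disagree == 0:
        continue
    p = binomtest(cand_only, n_disagree, 0.5, alternative="greater").pvalue

    if p < ALPHA:
        false_positives_raw += 1
    if p < ALPHA / N_CLASSIFIERS:  # Bonferroni-adjusted threshold
        false_positives_bonferroni += 1

print(f"spurious 'wins' at alpha={ALPHA}: {false_positives_raw}")
print(f"spurious 'wins' after Bonferroni adjustment: {false_positives_bonferroni}")
```

With 100 comparisons at a nominal alpha of 0.05, roughly five spurious "wins" are expected without correction, while the Bonferroni-adjusted threshold of 0.05/100 typically yields none.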