Aggregating performance metrics for classifier evaluation

  • Authors:
  • Naeem Seliya; Taghi M. Khoshgoftaar; Jason Van Hulse

  • Affiliations:
  • Computer and Information Science, University of Michigan - Dearborn, Dearborn, MI; Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL; Computer Science and Engineering, Florida Atlantic University, Boca Raton, FL

  • Venue:
  • IRI '09: Proceedings of the 10th IEEE International Conference on Information Reuse & Integration
  • Year:
  • 2009

Abstract

Several performance metrics have been proposed for evaluating a classification model, e.g., accuracy, error rate, precision, and recall. While it is known that evaluating a classifier on only one performance metric is not advisable, the use of multiple performance metrics poses unique comparative challenges for the analyst. Because different performance metrics provide different perspectives on the classifier performance space, it is common for a learner to be relatively better on one performance metric but worse on another. We present a novel approach to aggregating several individual performance metrics into one metric, called the Relative Performance Metric (RPM). A large case study consisting of 35 real-world classification datasets, 12 classification algorithms, and 10 commonly used performance metrics illustrates the practical appeal of RPM. The empirical results clearly demonstrate the benefits of using RPM when classifier evaluation requires the consideration of a large number of individual performance metrics.
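
The abstract does not give the formula for RPM, so the sketch below is not the authors' metric. It only illustrates the general idea the abstract describes: collapsing several metrics, measured on different scales and in different directions (higher-is-better vs. lower-is-better), into a single relative score per classifier. Here a simple rank-averaging scheme is used, and all names (mean_rank_score, the example score matrix) are hypothetical.

```python
import numpy as np

def mean_rank_score(scores, higher_is_better):
    """Aggregate a (classifiers x metrics) score matrix into one value per classifier.

    scores           : 2-D array-like, rows = classifiers, columns = metrics
    higher_is_better : one boolean per metric (False for error-rate-like metrics)
    Returns the mean rank of each classifier across all metrics (lower = better).
    """
    scores = np.asarray(scores, dtype=float)
    ranks = np.empty_like(scores)
    for j, better_high in enumerate(higher_is_better):
        # Negate higher-is-better columns so that rank 0 always means "best".
        col = -scores[:, j] if better_high else scores[:, j]
        # argsort twice yields 0-based ranks; ties are broken arbitrarily here.
        ranks[:, j] = col.argsort().argsort()
    return ranks.mean(axis=1)

# Hypothetical example: 3 classifiers evaluated on accuracy, AUC, and error rate.
scores = [[0.91, 0.88, 0.09],
          [0.89, 0.93, 0.11],
          [0.90, 0.90, 0.10]]
print(mean_rank_score(scores, higher_is_better=[True, True, False]))
# -> [0.667, 1.333, 1.0]: the first classifier is relatively best overall.
```

A lower aggregate rank indicates a classifier that is relatively strong across all metrics considered; the paper itself should be consulted for the actual RPM aggregation procedure.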