Transforming classifier scores into accurate multiclass probability estimates

  • Authors:
  • Bianca Zadrozny;Charles Elkan

  • Affiliations:
  • University of California, San Diego, La Jolla, California;University of California, San Diego, La Jolla, California

  • Venue:
  • Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

Class membership probability estimates are important for many applications of data mining in which classification outputs are combined with other sources of information for decision-making, such as example-dependent misclassification costs, the outputs of other classifiers, or domain knowledge. Previous calibration methods apply only to two-class problems. Here, we show how to obtain accurate probability estimates for multiclass problems by combining calibrated binary probability estimates. We also propose a new method for obtaining calibrated two-class probability estimates that can be applied to any classifier that produces a ranking of examples. Using naive Bayes and support vector machine classifiers, we give experimental results from a variety of two-class and multiclass domains, including direct marketing, text categorization and digit recognition.