Class Probability Estimation and Cost-Sensitive Classification Decisions

  • Authors:
  • Dragos D. Margineantu

  • Affiliations:
  • -

  • Venue:
  • ECML '02 Proceedings of the 13th European Conference on Machine Learning
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

For a variety of applications, machine learning algorithms are required to construct models that minimize the total loss associated with the decisions, rather than the number of errors. One of the most efficient approaches to building models that are sensitive to non-uniform costs of errors is to first estimate the class probabilities of the unseen instances and then to make the decision based on both the computed probabilities and the loss function. Although all classification algorithms can be converted into algorithms for learning models that compute class probabilities, in many cases the computed estimates have proven to be inaccurate. As a result, there is a large research effort to improve the accuracy of the estimates computed by different algorithms. This paper presents a novel approach to cost-sensitive learning that addresses the problem of minimizing the actual cost of the decisions rather than improving the overall quality of the probability estimates. The decision making step for our methods is based on the distribution of the individual scores computed by classifiers that are built by different types of ensembles of decision trees. The new approach relies on statistics that measure the probability that the computed estimates are on one side or the other of the decision boundary, rather than trying to improve the quality of the estimates. The experimental analysis of the new algorithms that were developed based on our approach gives new insight into cost-sensitive decision making and shows that for some tasks, the new algorithms outperform some of the best probability-based algorithms for cost-sensitive learning.