Class Probability Estimation and Cost-Sensitive Classification Decisions

Authors:
Dragos D. Margineantu
Venue:
ECML '02 Proceedings of the 13th European Conference on Machine Learning
Year:
2002

Abstract

In many applications, machine learning algorithms must construct models that minimize the total loss associated with their decisions rather than the number of errors. One of the most efficient approaches to building models that are sensitive to non-uniform error costs is to first estimate the class probabilities of unseen instances and then make each decision based on both the estimated probabilities and the loss function. Although any classification algorithm can be converted into one that learns models computing class probabilities, in many cases the resulting estimates have proven inaccurate, and a large research effort is therefore devoted to improving the accuracy of the estimates computed by different algorithms. This paper presents a novel approach to cost-sensitive learning that addresses the problem of minimizing the actual cost of the decisions rather than improving the overall quality of the probability estimates. The decision-making step of our methods is based on the distribution of the individual scores computed by classifiers built by different types of ensembles of decision trees. The approach relies on statistics that measure the probability that the computed estimates lie on one side or the other of the decision boundary. An experimental analysis of the new algorithms developed from this approach gives new insight into cost-sensitive decision making and shows that, for some tasks, they outperform some of the best probability-based algorithms for cost-sensitive learning.
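The two decision styles contrasted in the abstract can be illustrated with a small sketch. The first function is the standard minimum-expected-loss rule applied to class probability estimates. The second is only one possible reading of a "side of the decision boundary" statistic (the fraction of ensemble members whose score exceeds the cost-derived threshold) and is not claimed to be the paper's algorithm. All function names, the loss matrix, and the member scores are hypothetical values chosen for illustration.

```python
from typing import Sequence, Tuple

# loss[d][k]: cost of making decision d when the true class is k (assumed layout).
Loss = Sequence[Sequence[float]]


def min_expected_loss_decision(probs: Sequence[float], loss: Loss) -> int:
    """Pick the decision whose expected loss under probs is smallest."""
    expected = [sum(p * c for p, c in zip(probs, row)) for row in loss]
    return min(range(len(expected)), key=expected.__getitem__)


def binary_threshold(loss: Loss) -> float:
    """Two-class case: value of P(class 1) above which deciding 1 is cheaper."""
    cost_fp = loss[1][0] - loss[0][0]  # extra cost of a false positive
    cost_fn = loss[0][1] - loss[1][1]  # extra cost of a false negative
    return cost_fp / (cost_fp + cost_fn)


def side_of_boundary_decision(
    member_scores: Sequence[float], loss: Loss
) -> Tuple[int, float]:
    """Illustrative assumption, not the paper's method: decide by the
    fraction of ensemble members whose score falls above the threshold."""
    t = binary_threshold(loss)
    frac_above = sum(s > t for s in member_scores) / len(member_scores)
    return (1 if frac_above > 0.5 else 0), frac_above


# Hypothetical costs: a false negative costs 10, a false positive costs 1.
loss = [[0.0, 10.0], [1.0, 0.0]]
# Hypothetical P(class 1) scores from five ensemble members for one instance.
scores = [0.02, 0.03, 0.05, 0.04, 0.60]

mean_p1 = sum(scores) / len(scores)
print(min_expected_loss_decision([1 - mean_p1, mean_p1], loss))  # uses the averaged estimate
print(side_of_boundary_decision(scores, loss))  # uses the distribution of scores
```

With these made-up numbers a single high score pulls the averaged estimate past the threshold of 1/11, so the probability-based rule decides class 1, whereas only one of the five members lies above the threshold, so the distribution-based rule decides class 0. The sketch shows only that the two rules can disagree; it says nothing about which is preferable on a given task.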