Fundamentals of statistical signal processing: estimation theory
Machine Learning
MetaCost: a general method for making classifiers cost-sensitive
KDD '99: Proceedings of the Fifth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Explicitly representing expected cost: an alternative to ROC representation
Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Learning and making decisions when costs and probabilities are both unknown
Proceedings of the Seventh ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Support Vector Machines for Classification in Nonstandard Situations
Machine Learning
Cost-Sensitive Learning by Cost-Proportionate Example Weighting
ICDM '03: Proceedings of the Third IEEE International Conference on Data Mining
Pattern Classification (2nd Edition)
Editorial: special issue on learning from imbalanced data sets
ACM SIGKDD Explorations Newsletter: Special Issue on Learning from Imbalanced Data Sets
The foundations of cost-sensitive learning
IJCAI '01: Proceedings of the 17th International Joint Conference on Artificial Intelligence, Volume 2
An overview of statistical learning theory
IEEE Transactions on Neural Networks
Local estimation of posterior class probabilities to minimize classification errors
IEEE Transactions on Neural Networks
The Journal of Machine Learning Research
Asymmetric misclassification costs and imbalanced class prevalences are common in pattern classification. While much interest has been devoted to cost-sensitive learning techniques, the relationship between cost-sensitive learning and the specification of the model set in a parametric estimation framework remains somewhat unclear. To clarify this relationship, we distinguish the case in which the model set contains the true posterior from the case in which the model is misspecified. In the former case, we show that thresholding the maximum likelihood (ML) estimate is an asymptotically optimal solution to the risk minimization problem. Under model misspecification, by contrast, thresholded ML is suboptimal, and the risk-minimizing solution varies with the misclassification cost ratio. Moreover, we show analytically that the negative weighted log likelihood (Elkan, 2001) is a tight, convex upper bound on the empirical loss. Coupled with empirical results on several real-world data sets, these findings lead us to argue that weighted ML is the preferred cost-sensitive technique.
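To make the comparison concrete, the following minimal sketch (not the paper's code; the synthetic data, cost values, and all function names are illustrative) contrasts the two strategies from the abstract on a deliberately misspecified model: a linear-logit classifier fit to two Gaussian classes with unequal variances, so the true posterior is quadratic in x. It fits plain ML and shifts the decision threshold to the Bayes-optimal value c10/(c01 + c10), then fits cost-proportionate weighted ML in the style of Elkan (2001) and thresholds at 1/2, reporting the empirical cost-weighted risk of each.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, w=None, lr=0.1, iters=2000):
    """Weighted ML for logistic regression via gradient ascent.
    With w = None this is plain ML; with per-example weights it is
    cost-proportionate weighting (Elkan, 2001)."""
    if w is None:
        w = np.ones(len(y))
    Xb = np.hstack([X, np.ones((len(X), 1))])        # append intercept column
    theta = np.zeros(Xb.shape[1])
    for _ in range(iters):
        p = sigmoid(Xb @ theta)
        # gradient of the weighted log likelihood: X^T [w * (y - p)]
        theta += lr * Xb.T @ (w * (y - p)) / len(y)
    return theta

def predict(theta, X, thresh):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (sigmoid(Xb @ theta) >= thresh).astype(int)

def empirical_risk(y, yhat, c01, c10):
    # c01 = cost of predicting 0 when the truth is 1; c10 = the reverse
    return np.mean(c01 * ((y == 1) & (yhat == 0)) + c10 * ((y == 0) & (yhat == 1)))

# Misspecified setting: unequal class variances make the true posterior
# quadratic in x, while the model is linear in x.
n = 5000
y = (rng.random(n) < 0.5).astype(int)
X = np.where(y == 1, rng.normal(1.0, 3.0, n), rng.normal(0.0, 1.0, n)).reshape(-1, 1)

c01, c10 = 5.0, 1.0                       # asymmetric costs (illustrative values)
t_star = c10 / (c01 + c10)                # Bayes-optimal threshold for these costs

theta_ml = fit_logistic(X, y)                                  # plain ML + shifted threshold
theta_wml = fit_logistic(X, y, w=np.where(y == 1, c01, c10))   # weighted ML + threshold 1/2

print("thresholded ML risk:", empirical_risk(y, predict(theta_ml, X, t_star), c01, c10))
print("weighted ML risk   :", empirical_risk(y, predict(theta_wml, X, 0.5), c01, c10))

If the model set contained the true posterior, the two risks would coincide asymptotically; under the misspecification above, the weighted fit can place its decision boundary differently for each cost ratio rather than merely shifting a fixed posterior estimate, which is the distinction the abstract draws.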