We examine the relationship between the predictions made by different learning algorithms and true posterior probabilities. We show that maximum margin methods such as boosted trees and boosted stumps push probability mass away from 0 and 1, yielding a characteristic sigmoid-shaped distortion in the predicted probabilities. Models such as Naive Bayes, which make unrealistic independence assumptions, push probabilities toward 0 and 1. Other models, such as neural nets and bagged trees, do not have these biases and predict well-calibrated probabilities. We experiment with two ways of correcting the biased probabilities predicted by some learning methods: Platt Scaling and Isotonic Regression. We qualitatively examine what kinds of distortions these calibration methods are suitable for and quantitatively examine how much data they need to be effective. The empirical results show that, after calibration, boosted trees, random forests, and SVMs predict the best probabilities.
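The two calibration methods named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names (`fit_platt`, `predict_platt`, `fit_isotonic`, `predict_isotonic`), the learning rate, the step count, and the toy data are all assumptions made for illustration. Platt Scaling is shown as a sigmoid p = 1 / (1 + exp(a*s + b)) fitted by plain gradient descent on log loss, without the smoothed targets of Platt's original procedure. Isotonic Regression is shown via the pool-adjacent-violators algorithm, which fits a non-decreasing step function.

```python
import math

def fit_platt(scores, labels, lr=0.1, steps=5000):
    """Fit p = 1 / (1 + exp(a*s + b)) by gradient descent on log loss.
    Assumption: plain 0/1 targets and a fixed step size, for illustration only."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(a * s + b))
            # With z = a*s + b and p = 1/(1+exp(z)), d(logloss)/dz = y - p.
            grad_a += (y - p) * s
            grad_b += (y - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def predict_platt(a, b, score):
    """Map a raw classifier score to a calibrated probability."""
    return 1.0 / (1.0 + math.exp(a * score + b))

def fit_isotonic(scores, labels):
    """Pool adjacent violators: returns a list of (upper_score, value) steps
    whose values are non-decreasing in the score."""
    blocks = []  # each block is [sum_of_labels, count, largest_score_in_block]
    for s, y in sorted(zip(scores, labels)):
        blocks.append([float(y), 1, s])
        # Merge backwards while the previous block's mean is not below the last one's.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] >= blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]

def predict_isotonic(model, score):
    """Return the value of the first step whose upper score bound covers `score`;
    scores above every bound get the last step's value."""
    for upper, value in model:
        if score <= upper:
            return value
    return model[-1][1]

# Hypothetical held-out calibration set: raw scores and true 0/1 labels.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 0, 1, 1, 1, 1]

a, b = fit_platt(scores, labels)
iso = fit_isotonic(scores, labels)
for s in (0.15, 0.45, 0.85):
    print(s, round(predict_platt(a, b, s), 3), round(predict_isotonic(iso, s), 3))
```

The sigmoid has only two parameters, so it can fit from little data but can only undo sigmoid-shaped distortions. The isotonic fit can follow any monotonic distortion but has many more degrees of freedom, which is the trade-off the abstract refers to when it asks how much data each method needs.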