The standard criterion by which binary classifiers are usually judged, misclassification error, assumes equal costs of misclassifying the two classes or, equivalently, classifying at the 1/2 quantile of the conditional class probability function P[y=1|x]. Boosted classification trees are known to perform quite well for such problems. In this article we consider the use of standard, off-the-shelf boosting for two more general problems: 1) classification with unequal costs or, equivalently, classification at quantiles other than 1/2, and 2) estimation of the conditional class probability function P[y=1|x]. We first examine whether the latter problem, estimation of P[y=1|x], can be solved with LogitBoost, and with AdaBoost when combined with a natural link function. The answer is negative: both approaches are often ineffective because they overfit P[y=1|x] even though they perform well as classifiers. A central negative finding of this article is this disconnect between class probability estimation and classification. Next we consider the practice of over/under-sampling the two classes. We present an algorithm that uses AdaBoost in conjunction with Over/Under-Sampling and Jittering of the data ("JOUS-Boost"). This algorithm is simple yet successful: it preserves boosting's relative protection against overfitting while extending it to arbitrary misclassification costs and, equivalently, arbitrary quantile boundaries. Finally, we use collections of classifiers obtained from a grid of quantiles to form estimators of class probabilities. These estimates compare favorably with those obtained by a variety of methods across both simulated and real data sets.
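The abstract only sketches JOUS-Boost at a high level, so the Python sketch below shows one plausible reading of the procedure, not the authors' implementation. Everything here is an assumption for illustration: the function names (jous_resample, fit_quantile_grid, estimate_probabilities), the jitter scale, the quantile grid, and the number of boosting rounds. It uses scikit-learn's AdaBoostClassifier as the "off-the-shelf" boosting step. The idea: to classify at quantile q, resample class 1 in proportion (1 - q) and class 0 in proportion q (jittering resampled rows to break exact duplicates), run plain AdaBoost on the resampled data, and then read off a probability estimate from where a grid of such classifiers switches its vote from 1 to 0.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

def jous_resample(X, y, q, jitter_scale=0.1, seed=None):
    """Resample (X, y) so that ordinary misclassification-error training
    corresponds to classifying at quantile q of P[y=1|x]. Class 1 is
    drawn in proportion (1 - q), class 0 in proportion q; resampled rows
    get small uniform jitter to avoid exact duplicates.
    (Illustrative sketch; jitter_scale is an assumed tuning constant.)"""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    scale = jitter_scale * X.std(axis=0)          # per-feature jitter width
    n = len(y)
    n1 = max(1, int(round((1 - q) * n)))          # target size of class 1
    parts = []
    for label, m in ((1, n1), (0, max(1, n - n1))):
        Xc = X[y == label]
        idx = rng.integers(0, len(Xc), size=m)    # sample with replacement
        Xs = Xc[idx] + rng.uniform(-1.0, 1.0, (m, X.shape[1])) * scale
        parts.append((Xs, np.full(m, label)))
    Xq = np.vstack([p[0] for p in parts])
    yq = np.concatenate([p[1] for p in parts])
    return Xq, yq

def fit_quantile_grid(X, y, n_grid=9, n_estimators=200, seed=0):
    """Fit one off-the-shelf AdaBoost classifier per grid quantile q,
    each trained on its own JOUS-resampled version of the data."""
    quantiles = [(k + 1) / (n_grid + 1) for k in range(n_grid)]
    models = {}
    for q in quantiles:
        Xq, yq = jous_resample(X, y, q, seed=seed)
        models[q] = AdaBoostClassifier(n_estimators=n_estimators).fit(Xq, yq)
    return models

def estimate_probabilities(models, X):
    """Turn the grid of quantile classifiers into probability estimates:
    the classifier for quantile q should predict 1 exactly when
    P[y=1|x] > q, so p_hat(x) is read off from where predictions switch
    from 1 to 0 along the grid (here, via a simple vote count)."""
    quantiles = sorted(models)
    preds = np.stack([models[q].predict(np.asarray(X, dtype=float))
                      for q in quantiles])        # shape (K, n_samples)
    return (preds.sum(axis=0) + 0.5) / (len(quantiles) + 1)
```

Two caveats on this reading: the sketch jitters every resampled row, whereas jittering is only needed to separate duplicated (oversampled) points; and nothing here enforces that predictions are monotone in q, so the vote-count estimator is a crude stand-in for reading off the exact crossing quantile.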