Predicting rare classes: can boosting make any weak learner strong?

Authors:
Mahesh V. Joshi; Ramesh C. Agarwal; Vipin Kumar
Affiliations:
University of Minnesota, Minneapolis; IBM Almaden Research Center, San Jose, CA; University of Minnesota, Minneapolis, MN
Venue:
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2002

Citing 9

OHSUMED: an interactive retrieval evaluation and new large test collection for research
SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval

An introduction to computational learning theory

A simple, fast, and effective rule learner
AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference

Improved Boosting Algorithms Using Confidence-rated Predictions
Machine Learning - The Eleventh Annual Conference on Computational Learning Theory

Mining needle in a haystack: classifying rare classes via two-phase rule induction
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data

Information Retrieval

AdaCost: Misclassification Cost-Sensitive Boosting
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning

A Comparative Study of Cost-Sensitive Boosting Algorithms
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning

Evaluating Boosting Algorithms to Classify Rare Classes: Comparison and Improvements
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining

Cited 12

Mining with rarity: a unifying framework
ACM SIGKDD Explorations Newsletter - Special issue on learning from imbalanced datasets

Feature bagging for outlier detection
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining

Local decomposition for rare class analysis
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining

Anomaly detection: A survey
ACM Computing Surveys (CSUR)

Sequential anomaly detection based on temporal-difference learning: Principles, models and case studies
Applied Soft Computing

Multimedia search with pseudo-relevance feedback
CIVR'03 Proceedings of the 2nd international conference on Image and video retrieval

A large-scale active learning system for topical categorization on the web
Proceedings of the 19th international conference on World wide web

Anomaly intrusion detection for evolving data stream based on semi-supervised learning
ICONIP'08 Proceedings of the 15th international conference on Advances in neuro-information processing - Volume Part I

An imbalanced data rule learner
PKDD'05 Proceedings of the 9th European conference on Principles and Practice of Knowledge Discovery in Databases

A fuzzy anomaly detection system
WISI'06 Proceedings of the 2006 international conference on Intelligence and Security Informatics

Multi-level relationship outlier detection
International Journal of Business Intelligence and Data Mining

Feature selection for high-dimensional imbalanced data
Neurocomputing

Abstract

Boosting is a strong ensemble-based learning algorithm with the promise of iteratively improving the classification accuracy using any base learner, as long as it satisfies the condition of yielding weighted accuracy > 0.5. In this paper, we analyze boosting with respect to this basic condition on the base learner, to see if boosting ensures prediction of rarely occurring events with high recall and precision. First, we show that a base learner can satisfy the required condition even at poor recall or precision levels, especially for very rare classes. Furthermore, we show that the intelligent weight-updating mechanism in boosting, even in its strong cost-sensitive form, does not prevent cases where the base learner always achieves high precision but poor recall, or high recall but poor precision, when mapped to the original distribution. In either of these cases, we show that the voting mechanism of boosting fails to achieve good overall recall and precision for the ensemble. In effect, our analysis indicates that one cannot be blind to the base learner's performance and simply rely on the boosting mechanism to take care of its weakness. We validate our arguments empirically on a variety of real and synthetic rare class problems. In particular, using AdaCost as the boosting algorithm, and variations of PNrule and RIPPER as the base learners, we show that if algorithm A achieves a better recall-precision balance than algorithm B, then using A as the base learner in AdaCost yields significantly better performance than using B as the base learner.
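
The first claim in the abstract, that a base learner can clear the weighted-accuracy threshold of 0.5 while still having poor recall and precision on a rare class, can be illustrated with a small arithmetic sketch. Everything below is an assumption made for illustration: the dataset size, the 1% class prior, the hypothetical predictions, and the helper name `weighted_metrics` do not come from the paper, and uniform weights are used only to mimic the starting point of a boosting run.

```python
def weighted_metrics(y_true, y_pred, weights):
    """Return (weighted accuracy, recall, precision) for the rare class (label 1)."""
    triples = list(zip(y_true, y_pred, weights))
    total = sum(w for _, _, w in triples)
    correct = sum(w for t, p, w in triples if t == p)
    tp = sum(w for t, p, w in triples if t == 1 and p == 1)
    fn = sum(w for t, p, w in triples if t == 1 and p == 0)
    fp = sum(w for t, p, w in triples if t == 0 and p == 1)
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    return correct / total, recall, precision


# Hypothetical data: 1,000 examples, of which 10 (1%) belong to the rare class.
y_true = [1] * 10 + [0] * 990

# Hypothetical weak learner: finds 1 of the 10 rare examples
# and raises 4 false alarms among the 990 common examples.
y_pred = [1] + [0] * 9 + [1] * 4 + [0] * 986

# Uniform example weights (an assumption; boosting would reweight in later rounds).
weights = [1.0] * len(y_true)

acc, rec, prec = weighted_metrics(y_true, y_pred, weights)
print(f"weighted accuracy={acc:.3f} recall={rec:.3f} precision={prec:.3f}")
```

With these assumed numbers the learner is correct on 987 of 1,000 examples, so its weighted accuracy is far above 0.5, yet it recovers only 1 in 10 rare examples and only 1 in 5 of its positive predictions is right. The sketch covers only this starting condition; it does not reproduce the paper's analysis of weight updates or voting.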