A philosophical basis for knowledge acquisition
Knowledge Acquisition
C4.5: programs for machine learning
OHSUMED: an interactive retrieval evaluation and new large test collection for research
SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
A re-examination of text categorization methods
SIGIR '99 Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval
A simple, fast, and effective rule learner
AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference
Improved Boosting Algorithms Using Confidence-rated Predictions
Machine Learning - The Eleventh Annual Conference on Computational Learning Theory
Mining needle in a haystack: classifying rare classes via two-phase rule induction
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
Information Retrieval
AdaCost: Misclassification Cost-Sensitive Boosting
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
A Comparative Study of Cost-Sensitive Boosting Algorithms
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Evaluating Boosting Algorithms to Classify Rare Classes: Comparison and Improvements
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
Rule-Based prediction of rare extreme values
DS '06 Proceedings of the 9th international conference on Discovery Science
Learning good classifier models of rare events is a challenging task. On such problems, the recently proposed two-phase rule induction algorithm, PNrule, outperforms other non-meta methods of rule induction. Boosting is a strong meta-classifier approach and has been shown to adapt to skewed class distributions. PNrule's key feature is that it identifies the relevant false positives and removes them collectively. In this paper, we argue qualitatively that the boosting methodology does not guarantee this ability. We simulate learning scenarios of varying difficulty to demonstrate that this fundamental qualitative difference between the two mechanisms gives rise to many scenarios in which PNrule achieves performance comparable to, or significantly better than, that of AdaCost, a strong cost-sensitive boosting algorithm. Even comparable performance by PNrule is desirable, because it yields a single, more easily interpretable model rather than the ensemble of models produced by boosting. We also show similar supporting results on real-world and benchmark datasets.
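To make the contrast concrete, the following is a minimal Python sketch of the two-phase idea described in the abstract. It is not the authors' PNrule implementation: the rule learner is a deliberately simple single-feature threshold search, and names such as learn_threshold_rules, two_phase_fit, and two_phase_predict are hypothetical, introduced only for illustration. It assumes the rare positive class is labeled 1.

import numpy as np

def learn_threshold_rules(X, y, target, max_rules=5):
    # Greedily pick single-feature threshold rules that cover examples of
    # class `target`, scored by a crude support*precision product. This is a
    # hypothetical stand-in for a real rule learner, not PNrule's procedure.
    rules = []
    covered = np.zeros(len(y), dtype=bool)
    for _ in range(max_rules):
        best = None
        for j in range(X.shape[1]):
            for thr in np.unique(X[:, j]):
                for op in (np.greater_equal, np.less):
                    mask = op(X[:, j], thr) & ~covered
                    hits = int(np.sum(mask & (y == target)))
                    if hits == 0:
                        continue
                    precision = hits / int(np.sum(mask))
                    score = hits * precision
                    if best is None or score > best[0]:
                        best = (score, j, thr, op, mask)
        if best is None:
            break
        _, j, thr, op, mask = best
        rules.append((j, thr, op))
        covered |= mask
    return rules

def apply_rules(rules, X):
    # Union of the individual rule coverages.
    mask = np.zeros(len(X), dtype=bool)
    for j, thr, op in rules:
        mask |= op(X[:, j], thr)
    return mask

def two_phase_fit(X, y):
    # P-phase: rules that capture the rare positive class (favoring recall).
    p_rules = learn_threshold_rules(X, y, target=1)
    p_cover = apply_rules(p_rules, X)
    # N-phase: within the region covered by the P-rules, learn rules that
    # capture the false positives so they can be removed collectively.
    n_rules = learn_threshold_rules(X[p_cover], y[p_cover], target=0)
    return p_rules, n_rules

def two_phase_predict(p_rules, n_rules, X):
    # Predict positive if covered by some P-rule and by no N-rule.
    return apply_rules(p_rules, X) & ~apply_rules(n_rules, X)

The design choice the sketch highlights is that the N-phase operates only on the union of the P-phase's coverage, so a group of false positives sharing a common signature can be removed by a single rule; boosting, by contrast, reweights misclassified examples individually and is not guaranteed to discover such collective structure.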