Predicting rare classes: can boosting make any weak learner strong?
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining
Classification of rare events has many important data mining applications. Boosting is a promising meta-technique that improves the classification performance of any weak classifier. So far, no systematic study has been conducted to evaluate how boosting performs for the task of mining rare classes. In this paper, we evaluate three existing categories of boosting algorithms from the single viewpoint of how they update the example weights in each iteration, and discuss their possible effect on recall and precision of the rare class. We propose enhanced algorithms in two of the categories, and justify their choice of weight-updating parameters theoretically. Using specially designed synthetic datasets, we compare the capability of all the algorithms from the rare class perspective. The results support our qualitative analysis, and also indicate that our enhancements bring an extra capability for achieving a better balance between recall and precision in mining rare classes.
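To make the idea of per-iteration example weight updates concrete, here is a minimal sketch of one round of the standard discrete AdaBoost update, together with recall and precision computed for the rare class. It is not one of the enhanced algorithms proposed in the paper, whose update rules the abstract does not give. The function names `adaboost_reweight` and `recall_precision` and the toy data are assumptions made for illustration only.

```python
import math


def adaboost_reweight(weights, labels, predictions):
    """One round of the standard discrete AdaBoost weight update.

    labels and predictions are sequences of -1 / +1 values.
    Returns (alpha, new_weights), with new_weights summing to 1.
    """
    total = sum(weights)
    weights = [w / total for w in weights]
    # Weighted error of the weak classifier on the current distribution.
    err = sum(w for w, y, h in zip(weights, labels, predictions) if y != h)
    if not 0.0 < err < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Misclassified examples are scaled up, correct ones scaled down.
    scaled = [w * math.exp(-alpha * y * h)
              for w, y, h in zip(weights, labels, predictions)]
    z = sum(scaled)
    return alpha, [w / z for w in scaled]


def recall_precision(labels, predictions, rare=1):
    """Recall and precision with respect to the rare class label."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, h in pairs if y == rare and h == rare)
    fn = sum(1 for y, h in pairs if y == rare and h != rare)
    fp = sum(1 for y, h in pairs if y != rare and h == rare)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return recall, precision


# Hypothetical toy data: 2 rare (+1) examples among 10.
labels = [1, 1, -1, -1, -1, -1, -1, -1, -1, -1]
# The weak classifier misses one rare example and raises one false alarm.
predictions = [1, -1, -1, -1, -1, -1, -1, -1, -1, 1]
uniform = [0.1] * 10

alpha, new_weights = adaboost_reweight(uniform, labels, predictions)
recall, precision = recall_precision(labels, predictions)
```

In this toy run the missed rare example (a false negative) and the false alarm (a false positive) receive exactly the same weight increase, because the standard update depends only on whether an example was misclassified. How that increase is divided between the two kinds of error is what affects recall versus precision of the rare class, and that is the viewpoint from which the abstract compares boosting algorithms.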