Many real-world problems exhibit skewed class distributions, in which almost all cases belong to one class and far fewer to a smaller, usually more interesting class. A learner induced from an imbalanced data set typically has a low error rate on the majority class and an unacceptably high error rate on the minority class. This paper first provides an organized study of the methodologies that have been proposed to handle this problem. It then presents an experimental comparison of these methodologies with a proposed selective costing ensemble and concludes that such a framework can be a more effective solution to the problem. The method appears to improve identification of the difficult minority class in predictive analysis while keeping classification ability on the majority class at an acceptable level.