C4.5: programs for machine learning
C4.5: programs for machine learning
Artificial Intelligence Review - Special issue on lazy learning
On the Optimality of the Simple Bayesian Classifier under Zero-One Loss
Machine Learning - Special issue on learning with probabilistic representations
Machine Learning for the Detection of Oil Spills in Satellite Radar Images
Machine Learning - Special issue on applications of machine learning and the knowledge discovery process
Boosting and Rocchio applied to text filtering
Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
MetaCost: a general method for making classifiers cost-sensitive
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Robust Classification for Imprecise Environments
Machine Learning
Introduction to Bayesian Networks
Introduction to Bayesian Networks
On Bias, Variance, 0/1—Loss, and the Curse-of-Dimensionality
Data Mining and Knowledge Discovery
Data Mining and Knowledge Discovery
Automatic Construction of Decision Trees from Data: A Multi-Disciplinary Survey
Data Mining and Knowledge Discovery
Advances in Instance Selection for Instance-Based Learning Algorithms
Data Mining and Knowledge Discovery
Distributed Data Mining in Credit Card Fraud Detection
IEEE Intelligent Systems
AdaCost: Misclassification Cost-Sensitive Boosting
ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Evaluating Boosting Algorithms to Classify Rare Classes: Comparison and Improvements
ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
Applying One-Sided Selection to Unbalanced Datasets
MICAI '00 Proceedings of the Mexican International Conference on Artificial Intelligence: Advances in Artificial Intelligence
Exploratory Under-Sampling for Class-Imbalance Learning
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Data Mining: Practical Machine Learning Tools and Techniques, Second Edition (Morgan Kaufmann Series in Data Management Systems)
The class imbalance problem: A systematic study
Intelligent Data Analysis
SMOTE: synthetic minority over-sampling technique
Journal of Artificial Intelligence Research
Learning classifiers from imbalanced data based on biased minimax probability machine
CVPR'04 Proceedings of the 2004 IEEE computer society conference on Computer vision and pattern recognition
UAI'03 Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence
Hi-index | 0.00 |
Many real-world data sets exhibit skewed class distributions in which the classes are not approximately equally represented. Traditional machine learning algorithms can be biased towards majority class due to over-prevalence. This paper firstly provides a systematic study of the various methodologies that have tried to handle this problem. Finally, it presents an experimental study of these methodologies with a proposed local wrapper for reweighting training instances and it concludes that such a framework can be a more effective solution to the problem. Our method seems to allow improved identification of difficult small classes in predictive analysis, while keeping the classification ability of the other classes at an acceptable level.