Modeling binary responses from cross-sectional data is a well-studied problem, with satisfactory solutions that draw on both parametric and nonparametric methods. In many real situations, however, one of the two responses (usually the one of greatest interest for the analysis) is rare. It has been widely reported that this class imbalance heavily compromises the learning process, because the model tends to focus on the prevalent class and to ignore the rare events. A skewed class distribution affects not only the estimation of the classification model but also the evaluation of its accuracy, because the scarcity of data on the rare class leads to poor estimates of that accuracy. This work discusses the effects of class imbalance on both model training and model assessment. It also proposes a unified and systematic framework for imbalanced classification, based on a smoothed bootstrap resampling technique. The proposed technique rests on a sound theoretical basis, and an extensive empirical study shows that it outperforms the main alternative remedies for imbalanced learning problems.
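To make the idea of smoothed bootstrap resampling concrete, the following is a minimal sketch and not necessarily the procedure proposed in the paper. It assumes that each synthetic observation is generated by choosing a class with equal probability, drawing one observed point from that class, and perturbing it with Gaussian noise. The function name `smoothed_bootstrap`, the `shrink` parameter, and the normal-reference bandwidth rule are all assumptions made for illustration.

```python
import numpy as np


def smoothed_bootstrap(X, y, n_samples, rng=None, shrink=1.0):
    """Draw a roughly class-balanced synthetic sample (illustrative sketch)."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    d = X.shape[1]

    # Step 1: choose the class of each synthetic point with equal probability.
    new_y = rng.choice(classes, size=n_samples)
    new_X = np.empty((n_samples, d))

    for c in classes:
        Xc = X[y == c]
        n_c = len(Xc)
        idx = np.flatnonzero(new_y == c)

        # Per-feature Gaussian-kernel bandwidth from a normal-reference rule.
        # This choice is an assumption of the sketch, not taken from the text.
        spread = Xc.std(axis=0, ddof=1) if n_c > 1 else np.zeros(d)
        h = shrink * (4.0 / ((d + 2) * n_c)) ** (1.0 / (d + 4)) * spread

        # Step 2: resample observed points of class c with replacement.
        seeds = Xc[rng.integers(0, n_c, size=idx.size)]

        # Step 3: jitter each resampled point with kernel noise.
        new_X[idx] = seeds + rng.normal(size=(idx.size, d)) * h

    return new_X, new_y


# Toy data with a 95:5 class ratio (made up for the example).
demo_rng = np.random.default_rng(0)
X = np.vstack([demo_rng.normal(0.0, 1.0, (95, 2)),
               demo_rng.normal(3.0, 1.0, (5, 2))])
y = np.array([0] * 95 + [1] * 5)

Xs, ys = smoothed_bootstrap(X, y, n_samples=200, rng=1)
minority_share = float(np.mean(ys == 1))
```

In this sketch the synthetic sample is approximately balanced by construction, and the jitter means the rare class is represented by new points in the neighbourhood of the observed ones rather than by exact duplicates. Setting `shrink` to 0 would reduce it to plain random oversampling with replacement.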