The present paper studies the influence of two distinct factors on the performance of several resampling strategies for handling imbalanced data sets: the nature of the classifier used and the ratio between the minority and majority classes. Experiments with eight different classifiers show that the most significant differences arise for data sets with low or moderate imbalance: over-sampling clearly appears better than under-sampling for local classifiers, whereas some under-sampling strategies outperform over-sampling when classifiers with a global learning approach are employed.
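For readers unfamiliar with the two resampling families being compared, the sketch below shows their simplest (random) variants in pure Python: over-sampling duplicates minority-class examples until the classes are balanced, while under-sampling discards majority-class examples. The function names and implementation are illustrative only, not the strategies evaluated in the paper (which include more elaborate methods such as SMOTE).

```python
import random
from collections import Counter

def random_oversample(X, y, seed=0):
    """Balance classes by duplicating randomly chosen minority-class samples."""
    rng = random.Random(seed)
    counts = Counter(y)
    majority = max(counts, key=counts.get)
    target = counts[majority]
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        if label == majority:
            continue
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):          # add copies until this class reaches the majority size
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out

def random_undersample(X, y, seed=0):
    """Balance classes by randomly discarding majority-class samples."""
    rng = random.Random(seed)
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    target = counts[minority]
    X_out, y_out = [], []
    for label in counts:
        idx = [i for i, lab in enumerate(y) if lab == label]
        if len(idx) > target:                # keep only a random subset of the larger class
            idx = rng.sample(idx, target)
        for i in idx:
            X_out.append(X[i])
            y_out.append(label)
    return X_out, y_out
```

Both functions leave an originally balanced data set unchanged; on an imbalanced one, over-sampling grows the training set while under-sampling shrinks it, which is one practical reason the two families can interact differently with local and global classifiers.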