The present paper studies the influence of two distinct factors on the performance of several resampling strategies for handling imbalanced data sets: the nature of the classifier used and the ratio between the minority and majority classes. Experiments with eight different classifiers show that the most significant differences arise for data sets with low or moderate imbalance: over-sampling clearly appears better than under-sampling for local classifiers, whereas some under-sampling strategies outperform over-sampling when classifiers with a global learning approach are employed.
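For readers unfamiliar with the two resampling families being compared, the sketch below shows their simplest (random) variants in pure Python: over-sampling duplicates minority-class examples until the classes are balanced, while under-sampling discards majority-class examples. The function names and implementation are illustrative only, not the strategies evaluated in the paper (which include more elaborate methods such as SMOTE).

```python
import random
from collections import Counter

def random_oversample(X, y, seed=0):
    """Balance classes by duplicating randomly chosen minority-class samples."""
    rng = random.Random(seed)
    counts = Counter(y)
    majority = max(counts, key=counts.get)
    target = counts[majority]
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        if label == majority:
            continue
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):          # add copies until this class reaches the majority size
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out

def random_undersample(X, y, seed=0):
    """Balance classes by randomly discarding majority-class samples."""
    rng = random.Random(seed)
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    target = counts[minority]
    X_out, y_out = [], []
    for label in counts:
        idx = [i for i, lab in enumerate(y) if lab == label]
        if len(idx) > target:                # keep only a random subset of the larger class
            idx = rng.sample(idx, target)
        for i in idx:
            X_out.append(X[i])
            y_out.append(label)
    return X_out, y_out
```

Both functions leave an originally balanced data set unchanged; on an imbalanced one, over-sampling grows the training set while under-sampling shrinks it, which is one practical reason the two families can interact differently with local and global classifiers.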