On the use of surrounding neighbors for synthetic over-sampling of the minority class

Authors:
V. García; J. S. Sánchez; R. A. Mollineda
Affiliations:
Dept. Llenguatges i Sistemes Informàtics, Universitat Jaume I, Castelló de la Plana, Spain (all authors)
Venue:
SMO'08 Proceedings of the 8th conference on Simulation, modelling and optimization
Year:
2008



Abstract

Class imbalance can substantially degrade classification accuracy. One of the most popular methods for tackling this problem is the synthetic minority over-sampling technique (SMOTE). Building on the original SMOTE algorithm, we propose three surrounding-neighborhood approaches for generating artificial minority examples that take into account both the proximity and the spatial distribution of the examples. Experiments on ten real data sets compare the proposed models with SMOTE and demonstrate their effectiveness on a number of problems.
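To make the idea of synthetic over-sampling concrete, the following is a minimal sketch of the basic SMOTE interpolation step in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `neighbors` hook, and the toy data are all hypothetical. The abstract does not define the three surrounding-neighborhood approaches, so the sketch uses ordinary k-nearest neighbors and only marks where a different neighborhood definition could be plugged in.

```python
import math
import random


def k_nearest(point, candidates, k):
    """Return the k candidates closest to `point` (Euclidean distance)."""
    return sorted(candidates, key=lambda c: math.dist(point, c))[:k]


def smote_like(minority, n_new, k=5, neighbors=k_nearest, seed=0):
    """Generate n_new synthetic minority points by interpolating between a
    minority example and one of its neighbors (basic SMOTE scheme).

    `neighbors` is a hypothetical hook: a surrounding-neighborhood rule
    could be passed here instead of plain k-nearest neighbors.
    Requires at least two minority examples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = minority[:i] + minority[i + 1:]  # exclude the base point
        neighbor = rng.choice(neighbors(base, others, k))
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbor))
        )
    return synthetic


# Toy minority class: the four corners of the unit square (made-up data).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(minority, n_new=6, k=2)
```

Each synthetic point lies on the line segment between a minority example and one of its selected neighbors, so the choice of neighborhood determines where new examples can appear. That choice is the part the proposed approaches change.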