Safe-Level-SMOTE: Safe-Level-Synthetic Minority Over-Sampling TEchnique for Handling the Class Imbalanced Problem

Authors:
Chumphol Bunkhumpornpat;Krung Sinapiromsaran;Chidchanok Lursinsap
Affiliations:
Department of Mathematics, Faculty of Science, Chulalongkorn University,;Department of Mathematics, Faculty of Science, Chulalongkorn University,;Department of Mathematics, Faculty of Science, Chulalongkorn University,
Venue:
PAKDD '09 Proceedings of the 13th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining
Year:
2009

Citing 12
Cited 9

C4.5: programs for machine learning

C4.5: programs for machine learning
The relationship between recall and precision

Journal of the American Society for Information Science
Machine Learning for the Detection of Oil Spills in Satellite Radar Images

Machine Learning - Special issue on applications of machine learning and the knowledge discovery process
MetaCost: a general method for making classifiers cost-sensitive

KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
An introduction to support Vector Machines: and other kernel-based learning methods

An introduction to support Vector Machines: and other kernel-based learning methods
Data mining: concepts and techniques

Data mining: concepts and techniques
AdaCost: Misclassification Cost-Sensitive Boosting

ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Using Artificial Anomalies to Detect Unknown and Known Network Intrusions

ICDM '01 Proceedings of the 2001 IEEE International Conference on Data Mining
Editorial: special issue on learning from imbalanced data sets

ACM SIGKDD Explorations Newsletter - Special issue on learning from imbalanced datasets
SMOTE: synthetic minority over-sampling technique

Journal of Artificial Intelligence Research
The use of the area under the ROC curve in the evaluation of machine learning algorithms

Pattern Recognition
Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning

ICIC'05 Proceedings of the 2005 international conference on Advances in Intelligent Computing - Volume Part I

Class confidence weighted kNN algorithms for imbalanced data sets

PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part II
Analysis of preprocessing vs. cost-sensitive learning for imbalanced classification. Open problems on intrinsic data characteristics

Expert Systems with Applications: An International Journal
Mixed-sampling approach to unbalanced data distributions: a case study involving Leukemia's document profiling

WSEAS Transactions on Information Science and Applications
DBSMOTE: Density-Based Synthetic Minority Over-sampling TEchnique

Applied Intelligence
A pruning-based approach for searching precise and generalized region for synthetic minority over-sampling

PAKDD'12 Proceedings of the 16th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part II
Improving risk predictions by preprocessing imbalanced credit data

ICONIP'12 Proceedings of the 19th international conference on Neural Information Processing - Volume Part II
Analysing the classification of imbalanced data-sets with multiple classes: Binarization techniques and ad-hoc approaches

Knowledge-Based Systems
Empirical study of bagging predictors on medical data

AusDM '11 Proceedings of the Ninth Australasian Data Mining Conference - Volume 121
Addressing imbalanced classification with instance generation techniques: IPADE-ID

Neurocomputing

Quantified Score

Hi-index	0.00

Visualization

Abstract

The class imbalanced problem occurs in various disciplines when one of target classes has a tiny number of instances comparing to other classes. A typical classifier normally ignores or neglects to detect a minority class due to the small number of class instances. SMOTE is one of over-sampling techniques that remedies this situation. It generates minority instances within the overlapping regions. However, SMOTE randomly synthesizes the minority instances along a line joining a minority instance and its selected nearest neighbours, ignoring nearby majority instances. Our technique called Safe-Level-SMOTE carefully samples minority instances along the same line with different weight degree, called safe level. The safe level computes by using nearest neighbour minority instances. By synthesizing the minority instances more around larger safe level, we achieve a better accuracy performance than SMOTE and Borderline-SMOTE.