A Novel Differential Evolution-Clustering Hybrid Resampling Algorithm on Imbalanced Datasets

  • Authors:
  • Leichen Chen; Zhihua Cai; Lu Chen; Qiong Gu

  • Venue:
  • WKDD '10 Proceedings of the 2010 Third International Conference on Knowledge Discovery and Data Mining
  • Year:
  • 2010

Abstract

On imbalanced datasets (IDS), the separating hyperplane of a support vector machine (SVM) tends to shift toward the minority (positive) class, which leads to low classification accuracy. To address this problem, we propose a novel differential evolution-clustering hybrid resampling SVM algorithm (DEC-SVM). The algorithm first over-samples the positive class using mutation and crossover operators similar to those of Differential Evolution (DE), increasing the proportion of positive samples. It then applies clustering to the over-sampled training set as a data-cleaning step for both classes, removing redundant or noisy samples. Experimental results on ten standard UCI datasets show that DEC-SVM outperforms standard SVM, SMOTE-SVM, and DE-SVM in terms of F-measure and area under the ROC curve (AUC).
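
The over-sampling step described in the abstract can be illustrated with a minimal sketch. It assumes the classic DE/rand/1 mutation with binomial crossover, which may differ from the "similar" operators the paper actually uses. The function name `de_oversample` and the parameters `F` (scale factor), `CR` (crossover rate), and `seed` are hypothetical and chosen only for illustration. The clustering-based cleaning step is not sketched, since the abstract does not say which clustering method is used.

```python
import random


def de_oversample(minority, n_new, F=0.5, CR=0.9, seed=0):
    """Generate n_new synthetic minority samples with DE-style operators.

    Assumption: DE/rand/1 mutation plus binomial crossover; the paper's
    exact operators are not given in the abstract.
    """
    if len(minority) < 4:
        raise ValueError("need at least 4 minority samples")
    rng = random.Random(seed)
    dim = len(minority[0])
    synthetic = []
    for _ in range(n_new):
        # Pick a target x and three other samples a, b, c (distinct positions).
        x, a, b, c = rng.sample(minority, 4)
        # Mutation: perturb a by the scaled difference of b and c.
        mutant = [a[j] + F * (b[j] - c[j]) for j in range(dim)]
        # Binomial crossover: take each coordinate from the mutant with
        # probability CR, else from the target; j_rand forces at least
        # one mutant coordinate into the child.
        j_rand = rng.randrange(dim)
        child = [
            mutant[j] if (rng.random() < CR or j == j_rand) else x[j]
            for j in range(dim)
        ]
        synthetic.append(child)
    return synthetic


# Toy positive-class samples (made up for illustration only).
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.2], [0.9, 2.1], [1.4, 2.4]]
new_points = de_oversample(minority, n_new=10)
print(len(new_points))
```

In a full pipeline of the kind the abstract outlines, the synthetic points would be appended to the positive class, the combined training set would be cleaned by clustering, and the SVM would be trained on the result.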