Robust classification of imbalanced data using one-class and two-class SVM-based multiclassifiers

  • Authors:
  • Sebastián Maldonado;Claudio Montecinos

  • Affiliations:
  • Universidad de Los Andes, Mons. Alvaro del Portillo, Las Condes, Santiago, Chile;Operations Management Master Program, Universidad de Talca, Curicó, Chile

  • Venue:
  • Intelligent Data Analysis - Business Analytics and Intelligent Optimization
  • Year:
  • 2014

Abstract

The class imbalance problem is a relatively new challenge that has attracted growing attention from both industry and academia, since it strongly affects classification performance. Research has also established that class imbalance is not an issue by itself; rather, its interaction with class overlap and noise has an important impact on prediction performance and stability. This fact has motivated the development of several approaches for the classification of imbalanced data (see, e.g., [29,39]). In this paper, we address credit card customer churn prediction, an important topic in business analytics, using an ensemble of classifiers. Since this problem is considered highly imbalanced, we employ different classification techniques, such as Support Vector Data Description (SVDD) and two-class SVMs. The main idea is to address both class imbalance and class overlap by stacking different classification approaches, while evaluating the diversity of the individual classifiers using meta-learning measures. We performed experiments on artificial data sets and on one real customer churn prediction problem from a Chilean financial entity, comparing our approach with well-known classification techniques for imbalanced data. The proposed strategy achieves an improvement of 6.1% over the best individual classifier in terms of predictive performance, providing accurate and robust classification models for different levels of balance and noise.
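
The abstract mentions evaluating the diversity of the individual classifiers but does not name the measures used. As a minimal sketch, assuming the pairwise disagreement measure (a common diversity measure for ensembles, not necessarily the one used in the paper), diversity among base classifiers could be estimated as follows. The classifier names and the toy predictions are hypothetical and serve only as an illustration.

```python
from itertools import combinations


def disagreement(pred_a, pred_b):
    """Fraction of samples on which two classifiers predict different labels."""
    if len(pred_a) != len(pred_b) or not pred_a:
        raise ValueError("prediction lists must be non-empty and of equal length")
    return sum(a != b for a, b in zip(pred_a, pred_b)) / len(pred_a)


def mean_pairwise_disagreement(predictions):
    """Average disagreement over all pairs of base classifiers."""
    pairs = list(combinations(predictions.values(), 2))
    if not pairs:
        raise ValueError("need at least two classifiers")
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical predictions (1 = churner, 0 = non-churner) for six customers.
# The classifier names are illustrative assumptions, not taken from the paper.
preds = {
    "svdd": [1, 0, 0, 1, 0, 0],
    "two_class_svm": [1, 0, 1, 1, 0, 0],
    "third_classifier": [0, 0, 1, 1, 0, 1],
}

if __name__ == "__main__":
    for (name_a, a), (name_b, b) in combinations(preds.items(), 2):
        print(f"{name_a} vs {name_b}: {disagreement(a, b):.3f}")
    print(f"mean pairwise disagreement: {mean_pairwise_disagreement(preds):.3f}")
```

A value near 0 means the base classifiers make nearly identical predictions, in which case stacking them is unlikely to help much; larger values suggest the classifiers err on different samples, which is the situation a stacked ensemble is meant to exploit.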