Improving risk predictions by preprocessing imbalanced credit data

  • Authors:
  • Vicente García; Ana Isabel Marqués; Jose Salvador Sánchez

  • Affiliations:
  • Dep. Computer Languages and Systems - Institute of New Imaging Technologies, Universitat Jaume I, Castelló de la Plana, Spain; Dep. Business Administration and Marketing, Universitat Jaume I, Castelló de la Plana, Spain; Dep. Computer Languages and Systems - Institute of New Imaging Technologies, Universitat Jaume I, Castelló de la Plana, Spain

  • Venue:
  • ICONIP'12: Proceedings of the 19th International Conference on Neural Information Processing - Volume Part II
  • Year:
  • 2012

Abstract

Imbalanced credit data sets are databases in which the class of defaulters is heavily under-represented compared with the class of non-defaulters. This situation is very common in real-life credit scoring applications, yet it has received little attention. This paper investigates whether data resampling can improve the performance of learners built from imbalanced credit data sets, and whether the effectiveness of resampling depends on the type of classifier. Experimental results show that learning from the resampled sets consistently outperforms learning from the original imbalanced credit data, regardless of the classifier used.
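
To make the idea of data resampling concrete, the following is a minimal sketch of one common strategy, random oversampling of the minority class. The abstract does not state which resampling methods the paper evaluates, so the choice of technique, the function name `random_oversample`, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every class
    has as many examples as the largest class.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original examples belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Append random duplicates until the class reaches the target size.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data (made up): 8 non-defaulters (label 0) and 2 defaulters (label 1).
X = [[i, i % 3] for i in range(10)]
y = [0] * 8 + [1] * 2

X_res, y_res = random_oversample(X, y)
balanced_counts = Counter(y_res)
```

In an actual experiment, only the training split would be resampled, so that the test data keep the original class distribution.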