Solving Credit Scoring Problem with Ensemble Learning: A Case Study

  • Authors:
  • Hongrui Xie; Shuli Han; Xinyi Shu; Xinzhu Yang; Xiuyun Qu; Shiqiang Zheng

  • Venue:
  • KAM '09 Proceedings of the 2009 Second International Symposium on Knowledge Acquisition and Modeling - Volume 01
  • Year:
  • 2009

Abstract

Managing customer credit is an important issue in the banking industry, and it should be automated on the basis of trustworthy credit scoring. This paper presents our solution to the PAKDD 2009 data mining competition as a case study of the credit scoring problem. Following a brief description of the data mining task, we discuss several challenges encountered in the task, such as the imbalanced dataset, missing values, and data transformation. After a series of preliminary experiments, logistic regression and AdaBoost were selected as the classifiers for this particular problem. Furthermore, an ensemble of the two classifiers was created to achieve even better performance. The final result shows that our solution is effective and efficient, with an AUC value of 0.6535, which was the fifth best result among more than 100 competing teams.
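
The evaluation and ensembling idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the logistic regression and AdaBoost outputs were combined, so the weighted average of scores used here is an assumption, not the authors' method. The function names `auc` and `blend`, the weight parameter, and the toy labels and scores are likewise hypothetical and are not taken from the competition data.

```python
from typing import List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive example receives
    the higher score. Tied scores count as half a correct pair."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


def blend(scores_a: Sequence[float], scores_b: Sequence[float],
          weight_a: float = 0.5) -> List[float]:
    """Weighted average of two classifiers' scores (an assumed
    combination rule; the abstract does not specify one)."""
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(scores_a, scores_b)]


# Toy data for illustration only: 1 marks the positive class.
labels = [1, 1, 1, 0, 0, 0]
lr_scores = [0.9, 0.4, 0.7, 0.5, 0.2, 0.3]    # stand-in for logistic regression
ada_scores = [0.6, 0.8, 0.3, 0.2, 0.5, 0.4]   # stand-in for AdaBoost
ensemble_scores = blend(lr_scores, ada_scores)

print("LR AUC:      ", auc(labels, lr_scores))
print("AdaBoost AUC:", auc(labels, ada_scores))
print("Ensemble AUC:", auc(labels, ensemble_scores))
```

On this toy data the two score lists make different ranking mistakes, so averaging them ranks every positive example above every negative one. That is the general motivation for an ensemble, although it says nothing about how large the gain was on the actual competition data.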