Using data mining to improve assessment of credit worthiness via credit scoring models

  • Authors:
  • Bee Wah Yap;Seng Huat Ong;Nor Huselina Mohamed Husain

  • Affiliations:
  • Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia;Institute of Mathematical Sciences, University of Malaya, 50603 Kuala Lumpur, Malaysia;Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, 40450 Shah Alam, Selangor, Malaysia

  • Venue:
  • Expert Systems with Applications: An International Journal
  • Year:
  • 2011

Quantified Score

Hi-index 12.05

Visualization

Abstract

Credit scoring model have been developed by banks and researchers to improve the process of assessing credit worthiness during the credit evaluation process. The objective of credit scoring models is to assign credit risk to either a ''good risk'' group that is likely to repay financial obligation or a ''bad risk'' group who has high possibility of defaulting on the financial obligation. Construction of credit scoring models requires data mining techniques. Using historical data on payments, demographic characteristics and statistical techniques, credit scoring models can help identify the important demographic characteristics related to credit risk and provide a score for each customer. This paper illustrates using data mining to improve assessment of credit worthiness using credit scoring models. Due to privacy concerns and unavailability of real financial data from banks this study applies the credit scoring techniques using data of payment history of members from a recreational club. The club has been facing a problem of rising number in defaulters in their monthly club subscription payments. The management would like to have a model which they can deploy to identify potential defaulters. The classification performance of credit scorecard model, logistic regression model and decision tree model were compared. The classification error rates for credit scorecard model, logistic regression and decision tree were 27.9%, 28.8% and 28.1%, respectively. Although no model outperforms the other, scorecards are relatively much easier to deploy in practical applications.