A Bayesian latent variable model with classification and regression tree approach for behavior and credit scoring

  • Authors:
  • Ling-Jing Kao;Chih-Chou Chiu;Fon-Yu Chiu

  • Affiliations:
  • Department of Business Management, National Taipei University of Technology, Taiwan;Department of Business Management, National Taipei University of Technology, Taiwan;Institute of Commerce Automation and Management, National Taipei University of Technology, Taiwan

  • Venue:
  • Knowledge-Based Systems
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

A Bayesian latent variable model with classification and regression tree approach is built to overcome three challenges encountered by a bank in credit-granting process. These three challenges include (1) the bank wants to predict the future performance of an applicant accurately; (2) given current information about cardholders' credit usage and repayment behavior, financial institutions would like to determine the optimal credit limit and APR for an applicant; and (3) the bank would like to improve its efficiency by automating the process of credit-granting decisions. Data from a leading bank in Taiwan is used to illustrate the combined approach. The data set consists of each credit card holder's credit usage and repayment data, demographic information, and credit report. Empirical study shows that the demographic variables used in most credit scoring models have little explanatory ability with regard to a cardholder's credit usage and repayment behavior. A cardholder's credit history provides the most important information in credit scoring. The continuous latent customer quality from the Bayesian latent variable model allows considerable latitude for producing finer rules for credit granting decisions. Compared to the performance of discriminant analysis, logistic regression, neural network, multivariate adaptive regression splines (MARS) and support vector machine (SVM), the proposed model has a 92.9% accuracy rate in predicting customer types, is less impacted by prior probabilities, and has a significantly low Type I errors in comparison with the other five approaches.