Could Decision Trees Improve the Classification Accuracy and Interpretability of Loan Granting Decisions?

  • Authors:
  • Jozef Zurada

  • Venue:
  • HICSS '10 Proceedings of the 2010 43rd Hawaii International Conference on System Sciences
  • Year:
  • 2010

Abstract

The paper compares the classification accuracy of eight models: logistic regression (LR), neural network (NN), radial basis function neural network (RBFNN), support vector machine (SVM), case-based reasoning (CBR), and three decision trees (DTs). We build the models and test their classification accuracy rates on a historical data set provided by a German financial institution. The data set contains 21 financial attributes of 1000 customers. Although the institution deemed every applicant qualified for a loan at the time of application, 300 of them defaulted and 700 repaid the loan. To obtain reliable and unbiased error estimates for each of the eight models, we apply 10-fold cross-validation and repeat the experiment 10 times. In overall classification accuracy at a 0.5 probability cut-off, two of the three DT models significantly outperform (at alpha=0.05) the remaining models. We then concentrate on the DT models and compare their performance at the 0.3 and 0.7 cut-off levels, which are more likely to be used by financial institutions. The DT models not only classify better than the other models, but the knowledge they learn, expressed as if-then rules, is easy to interpret, makes sense, and may be of value to financial institutions that may have to explain the reasons for a loan denial.
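
The evaluation protocol described in the abstract (10-fold cross-validation repeated 10 times, with class labels assigned at a probability cut-off) can be sketched roughly as follows. This is a minimal illustration in plain Python, not the paper's implementation: the function names, the random shuffling scheme, and the convention that the model outputs a probability of default (label 1) are all assumptions made for the example.

```python
import random

def repeated_kfold_indices(n, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold CV.

    Assumption: each repeat reshuffles the records with a plain random
    permutation; the abstract does not say how the folds were drawn.
    """
    rng = random.Random(seed)
    for r in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        # Slice the shuffled indices into k folds of near-equal size.
        folds = [idx[i::k] for i in range(k)]
        for f in range(k):
            test_idx = folds[f]
            train_idx = [i for g in range(k) if g != f for i in folds[g]]
            yield r, f, train_idx, test_idx

def classify(prob_default, cutoff=0.5):
    """Label a case 1 (assumed: predicted default) when its estimated
    probability reaches the cut-off, otherwise 0."""
    return [1 if p >= cutoff else 0 for p in prob_default]

def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# 1000 records, 10 folds, 10 repeats: 100 train/test splits in total.
splits = list(repeated_kfold_indices(1000, k=10, repeats=10))

# Hypothetical probabilities, only to show how the cut-off changes labels.
probs = [0.2, 0.4, 0.6, 0.8]
labels_03 = classify(probs, cutoff=0.3)
labels_05 = classify(probs, cutoff=0.5)
labels_07 = classify(probs, cutoff=0.7)
```

Under the assumed labelling, lowering the cut-off to 0.3 flags more applicants as likely defaulters, while raising it to 0.7 flags fewer. Averaging `accuracy` over all 100 test folds gives one overall estimate per model.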