Learning without default: a study of one-class classification and the low-default portfolio problem

Authors:
Kenneth Kennedy;Brian Mac Namee;Sarah Jane Delany
Affiliations:
School of Computing, Dublin Institute of Technology, Dublin, Ireland;School of Computing, Dublin Institute of Technology, Dublin, Ireland;Digital Media Centre, Dublin Institute of Technology, Dublin, Ireland
Venue:
AICS'09 Proceedings of the 20th Irish conference on Artificial intelligence and cognitive science
Year:
2009

Abstract

This paper asks at what level of class imbalance one-class classifiers outperform two-class classifiers in credit scoring problems where class imbalance, known as the low-default portfolio problem, is a serious issue. We answer the question by comparing the performance of a variety of recognised one-class and two-class classifiers on a selection of credit scoring datasets as the class imbalance is manipulated. We also include random oversampling, as it is one of the most common approaches to addressing class imbalance. Based on our study we conclude that the performance of the two-class classifiers deteriorates in proportion to the level of class imbalance. The two-class classifiers outperform the one-class classifiers when the minority class makes up 15% or more of the data (i.e. the ratio of minority class to majority class is 15:85 or more balanced). The one-class classifiers, whose performance remains stable throughout, are preferred when the minority class constitutes approximately 2% or less of the data. Between 2% and 15% the results are less conclusive. These results show that one-class classifiers could potentially be used as a solution to the low-default portfolio problem experienced in the credit scoring domain.
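
The two data manipulations mentioned in the abstract, reducing the minority class to a target share and random oversampling, can be illustrated with a minimal Python sketch. This is not the authors' experimental code: the function names `subsample_minority` and `random_oversample`, the 850/300 example counts, and the choice to keep every majority example are all assumptions made for illustration.

```python
import random


def subsample_minority(majority, minority, minority_fraction, rng):
    """Keep all majority examples and draw just enough minority examples
    (without replacement) so the minority makes up roughly
    `minority_fraction` of the resulting dataset. Hypothetical helper."""
    if not 0.0 < minority_fraction < 1.0:
        raise ValueError("minority_fraction must be strictly between 0 and 1")
    # Solve n_min / (n_min + n_maj) = f for n_min, then round to an integer.
    n_keep = round(minority_fraction * len(majority) / (1.0 - minority_fraction))
    if n_keep > len(minority):
        raise ValueError("not enough minority examples for the requested fraction")
    return list(majority), rng.sample(minority, n_keep)


def random_oversample(majority, minority, rng):
    """Random oversampling: duplicate randomly chosen minority examples
    (with replacement) until both classes are the same size."""
    if not minority:
        raise ValueError("cannot oversample an empty minority class")
    n_extra = max(0, len(majority) - len(minority))
    extra = [rng.choice(minority) for _ in range(n_extra)]
    return list(majority), list(minority) + extra


# Illustrative data only: 850 non-defaulters and 300 defaulters (assumed counts).
rng = random.Random(0)
good = [("good", i) for i in range(850)]
bad = [("bad", i) for i in range(300)]

# A 15:85 minority-to-majority ratio, as in the abstract's example.
maj_15, min_15 = subsample_minority(good, bad, 0.15, rng)

# Rebalance the imbalanced sample by random oversampling.
maj_bal, min_bal = random_oversample(maj_15, min_15, rng)
```

In a comparison of the kind the abstract describes, a two-class classifier would be trained on both lists (with or without oversampling), whereas a one-class classifier would be trained on the majority class alone.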