The problem of imbalanced data sets arises whenever one class represents a circumscribed concept while the other represents the counterpart of that concept. It can take two distinct forms: either the counterpart class is under-sampled relative to the concept class, or it is over-sampled but particularly sparse. Bankruptcy prediction is one such case: classifiers face a large number of healthy firms and a smaller number of bankrupt firms. This paper first provides a systematic study of the methodologies that have been proposed to handle imbalanced data sets. It then presents an experimental comparison of these methodologies with a proposed technique and concludes that the proposed technique can be a more effective solution for bankruptcy prediction. Our method appears to improve identification of the difficult small class (bankrupt firms) in predictive analysis while keeping classification ability on the other class (healthy firms) at an acceptable level.