A personal bankruptcy prediction system that runs on credit card data is proposed. Personal bankruptcy, which usually results in significant losses to creditors, is a rapidly increasing yet little understood phenomenon. The most commonly used methods for predicting personal bankruptcy are credit scoring models, and some data mining models have also been investigated in this domain. Neither the scoring models nor the existing data mining methods adequately take the sequence information in credit card data into account. In our system, the main predictors are sequence patterns, obtained by developing sequence mining techniques and applying them to credit card data from one major Canadian bank. The mined sequence patterns, which we refer to as bankruptcy features, are represented in a low-dimensional vector space. This new feature space can be extended with existing predictive features (e.g., credit score), and a support vector machine (SVM) classifier is built on it to combine the mined features with the existing ones. Our system is readily comprehensible and demonstrates promising prediction performance.
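The step from mined sequence patterns to a low-dimensional feature vector can be illustrated with a minimal sketch. The abstract does not specify the mining technique or the encoding, so everything here is an assumption made for illustration: account histories are taken to be lists of discretized monthly states, a pattern is treated as an ordered subsequence (gaps allowed), each pattern contributes one binary feature, and the credit score is a hypothetical value already scaled to [0, 1]. The names `contains_subsequence` and `build_feature_vector` and the state labels are invented for this example.

```python
from typing import List, Optional, Sequence


def contains_subsequence(sequence: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    remaining = iter(sequence)
    # `symbol in remaining` advances the iterator past the first match,
    # so later pattern symbols can only match later positions.
    return all(symbol in remaining for symbol in pattern)


def build_feature_vector(
    sequence: Sequence[str],
    patterns: Sequence[Sequence[str]],
    credit_score: Optional[float] = None,
) -> List[float]:
    """One binary feature per pattern, optionally extended with a credit score."""
    features = [1.0 if contains_subsequence(sequence, p) else 0.0 for p in patterns]
    if credit_score is not None:
        # Assumption: the score is already scaled to a range comparable
        # with the binary pattern features.
        features.append(credit_score)
    return features


# Hypothetical mined patterns and one hypothetical account history.
patterns = [
    ("late", "over_limit"),
    ("min_payment", "min_payment", "late"),
]
account = ["on_time", "late", "min_payment", "over_limit", "late"]

vector = build_feature_vector(account, patterns, credit_score=0.62)
print(vector)
```

Vectors built this way, one per account, could then be passed with bankruptcy labels to any SVM implementation. The dimensionality equals the number of retained patterns plus the number of appended existing features, which is what keeps the space small.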