While extensive research in data mining has been devoted to developing better classification algorithms, relatively little research has examined the effects of feature construction, guided by domain knowledge, on classification performance. In many application domains, however, domain knowledge can be used to construct higher-level features that may improve performance. For example, past research and regulatory practice in early warning of bank failures have produced various explanatory variables, in the form of financial ratios, that are constructed from bank accounting variables and are believed to be more effective than the original variables in identifying potential problem banks. In this study, we empirically compare the performance of two sets of classifiers for bank failure prediction, one built using raw accounting variables and the other built using constructed financial ratios. Four popular data mining methods are used to learn the classifiers: logistic regression, decision tree, neural network, and k-nearest neighbor. We evaluate the classifiers on the basis of expected misclassification cost under a wide range of possible settings. The results of the study strongly indicate that feature construction, guided by domain knowledge, significantly improves classifier performance and that the degree of improvement varies significantly across the methods.
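
The two ideas in the abstract, constructing ratio features from raw accounting variables and scoring a classifier by expected misclassification cost, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the `Bank` fields, the two ratios, and the cost formula (a commonly used form, prior times error rate times unit cost, summed over the two error types) are not taken from the study, which does not state its ratio set or cost definition in this abstract.

```python
from dataclasses import dataclass


@dataclass
class Bank:
    # Hypothetical raw accounting variables (not the study's actual inputs).
    total_assets: float
    total_equity: float
    net_income: float


def construct_ratios(bank: Bank) -> dict:
    """Build higher-level ratio features from raw accounting variables.

    The two ratios are illustrative assumptions only.
    """
    return {
        "equity_to_assets": bank.total_equity / bank.total_assets,
        "return_on_assets": bank.net_income / bank.total_assets,
    }


def expected_misclassification_cost(y_true, y_pred, cost_fn, cost_fp,
                                    prior_fail=None):
    """Assumed form: P(fail)*FNR*cost_fn + P(non-fail)*FPR*cost_fp.

    Labels: 1 = failed bank, 0 = surviving bank. If prior_fail is None,
    the failure rate observed in y_true is used as the prior.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    # Missed failures (false negatives) and false alarms (false positives).
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fnr = fn / positives if positives else 0.0
    fpr = fp / negatives if negatives else 0.0
    if prior_fail is None:
        prior_fail = positives / len(y_true)
    return prior_fail * fnr * cost_fn + (1.0 - prior_fail) * fpr * cost_fp


if __name__ == "__main__":
    print(construct_ratios(Bank(total_assets=100.0, total_equity=8.0,
                                net_income=1.0)))
    y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
    # Assumes a missed failure costs ten times as much as a false alarm.
    print(expected_misclassification_cost(y_true, y_pred,
                                          cost_fn=10.0, cost_fp=1.0))
```

Sweeping `cost_fn`, `cost_fp`, and `prior_fail` over a grid is one way to mimic evaluation "under a wide range of possible settings"; the same function could then be applied to predictions from a classifier trained on raw variables and one trained on constructed ratios.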