Enhancing the classification accuracy by scatter-search-based ensemble approach

Authors:
Shih-Chieh Chen;Shih-Wei Lin;Shuo-Yan Chou
Affiliations:
Department of Industrial Management, National Taiwan University of Science and Technology, No. 43, Section 4, Keelung Road, Taipei 106, Taiwan, ROC;Department of Information Management, Chang Gung University, No. 259 Wen-Hwa 1st Road, Kwei-Shan, Tao-Yuan 333, Taiwan, ROC;Department of Industrial Management, National Taiwan University of Science and Technology, No. 43, Section 4, Keelung Road, Taipei 106, Taiwan, ROC
Venue:
Applied Soft Computing
Year:
2011

Abstract

Data-mining algorithms have been applied to many classification problems. Among them, the decision tree (DT), back-propagation network (BPN), and support vector machine (SVM) are popular and are used in a wide range of areas. However, different problems may require different parameter values for DT, BPN, or SVM, and poorly chosen values can lead to unsatisfactory results. In addition, a dataset may contain many features, not all of which are beneficial for classification. A scatter search (SS) approach is therefore proposed to find better parameter values and to select a beneficial subset of features, with the aim of improving classification results. Each of these classification algorithms has its own advantages and disadvantages, and its suitability depends on the characteristics of the problem. If the algorithms work together in a so-called ensemble, better results can be expected. This study therefore adopts an ensemble approach to further enhance the classification accuracy rate. To evaluate the performance of the proposed approach, datasets from the UCI (University of California, Irvine) repository were used as the test problem set, and the results were compared with those of several well-known published approaches. The comparative study shows that the proposed approach improved the classification accuracy rate on most datasets. The proposed approach can thus be useful to both practitioners and researchers.
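
The ensemble idea can be illustrated with a minimal sketch. The abstract does not say how the outputs of DT, BPN, and SVM are combined, so the plain majority vote below is an assumption, as are the function names, the tie-breaking rule, and the contrived label lists standing in for real classifier outputs.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    predictions: list of equal-length label lists, one list per classifier.
    Ties go to the label predicted by the earliest-listed classifier
    (an assumed rule, not taken from the paper).
    """
    combined = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        # First label, in classifier order, that reaches the top count.
        combined.append(next(label for label in labels if counts[label] == top))
    return combined


def accuracy(predicted, actual):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical predictions from three already-trained classifiers.
dt_pred = ["a", "b", "b", "b"]
bpn_pred = ["a", "a", "a", "b"]
svm_pred = ["b", "b", "a", "a"]
truth = ["a", "b", "a", "b"]

ensemble_pred = majority_vote([dt_pred, bpn_pred, svm_pred])
print(ensemble_pred, accuracy(ensemble_pred, truth))
```

In this contrived case each classifier errs on a different sample, so the vote corrects every individual mistake. Real datasets will not behave this neatly, and the vote helps only to the extent that the classifiers' errors differ.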