On selecting additional predictive models in double bagging type ensemble method

Authors:
Zaman Faisal;Mohammad Mesbah Uddin;Hideo Hirose
Affiliations:
Kyushu Institute of Technology, Kawazu, Iizuka, Japan;Kyushu University, Fukuoka, Japan;Kyushu University, Fukuoka, Japan
Venue:
ICCSA'10 Proceedings of the 2010 international conference on Computational Science and Its Applications - Volume Part IV
Year:
2010


Abstract

Double Bagging is a parallel ensemble method in which an additional classifier model is trained on the out-of-bag samples, and the posterior class probabilities of this additional classifier are then added to the in-bag samples to train a decision tree classifier. The subsampled version of double bagging depends on two hyperparameters: the subsample ratio (SSR) and the choice of additional classifier. In this paper we propose an embedded cross-validation based selection technique that selects one of these parameters automatically. During the training phase of the ensemble method, the technique builds a separate ensemble classifier model for each candidate value of that parameter (keeping the other fixed) and finally selects the one with the highest accuracy. We use four additional classifier models, Radial Basis Support Vector Machine (RSVM), Linear Support Vector Machine (LSVM), and Nearest Neighbor Classifiers (5-NN and 10-NN), with five subsample ratios: 0.1, 0.2, 0.3, 0.4 and 0.5. We report the performance of the subsampled double bagging ensemble for each of these SSRs with each of these additional classifiers. Our experiments use UCI benchmark datasets. The results indicate that LSVM performs best as an additional classifier for enhancing the predictive power of double bagging, and that double bagging performs better with SSR 0.4 and 0.5 than with the other SSRs. We also compare the resulting ensemble methods with Bagging, AdaBoost, Double Bagging (original) and Rotation Forest. The experimental results show that the resulting subsampled double bagging ensemble performs better than these ensemble methods.
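
The procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made to keep it short and dependency-free. A depth-1 decision stump stands in for the decision tree, a k-NN model (k = 5) plays the role of the additional classifier, a small synthetic two-class sample stands in for the UCI datasets, and plain 3-fold cross-validation is one possible reading of the "embedded cross-validation" step. All function names (`knn_posteriors`, `fit_stump`, `fit_double_bagging`, `predict`, `select_ssr`, `make_data`) are hypothetical.

```python
import random
from collections import Counter

def knn_posteriors(train, x, k, classes):
    """Share of the k nearest training points in each class (posterior estimate)."""
    nearest = sorted(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))[:k]
    votes = Counter(label for _, label in nearest)
    return [votes[c] / len(nearest) for c in classes]

def fit_stump(rows):
    """Depth-1 tree (assumed stand-in for a full decision tree): best single split."""
    best = None
    for j in range(len(rows[0][0])):
        for t in sorted({f[j] for f, _ in rows}):
            left = Counter(y for f, y in rows if f[j] <= t)
            right = Counter(y for f, y in rows if f[j] > t)
            l_lab = left.most_common(1)[0][0]
            r_lab = right.most_common(1)[0][0] if right else l_lab
            errors = len(rows) - left[l_lab] - right[r_lab]
            if best is None or errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    return best[1:]

def fit_double_bagging(data, ssr, k=5, n_estimators=15, seed=0):
    """Subsampled double bagging; requires ssr < 1 so the out-of-bag set is non-empty."""
    rng = random.Random(seed)
    classes = sorted({y for _, y in data})
    members = []
    for _ in range(n_estimators):
        picked = rng.sample(range(len(data)), max(2, int(ssr * len(data))))
        in_set = set(picked)
        inbag = [data[i] for i in picked]
        oob = [data[i] for i in range(len(data)) if i not in in_set]
        # The additional classifier (k-NN "trained" on the out-of-bag samples)
        # supplies posterior probabilities that are appended to the in-bag features.
        augmented = [(list(f) + knn_posteriors(oob, f, k, classes), y) for f, y in inbag]
        members.append((oob, fit_stump(augmented)))
    return classes, k, members

def predict(model, x):
    """Majority vote over the ensemble members."""
    classes, k, members = model
    votes = Counter()
    for oob, (j, t, l_lab, r_lab) in members:
        z = list(x) + knn_posteriors(oob, x, k, classes)
        votes[l_lab if z[j] <= t else r_lab] += 1
    return votes.most_common(1)[0][0]

def accuracy(model, rows):
    return sum(predict(model, f) == y for f, y in rows) / len(rows)

def select_ssr(data, ssrs, folds=3, **kw):
    """Pick the SSR with the highest mean k-fold CV accuracy on the training data."""
    scores = {}
    for ssr in ssrs:
        accs = []
        for i in range(folds):
            val = data[i::folds]
            train = [r for n, r in enumerate(data) if n % folds != i]
            accs.append(accuracy(fit_double_bagging(train, ssr, **kw), val))
        scores[ssr] = sum(accs) / folds
    return max(scores, key=scores.get), scores

def make_data(n=90, seed=1):
    """Synthetic two-class Gaussian data (an assumption, not a UCI dataset)."""
    rng = random.Random(seed)
    return [([rng.gauss(2.0 * (i % 2), 1.0), rng.gauss(2.0 * (i % 2), 1.0)], i % 2)
            for i in range(n)]

data = make_data()
best_ssr, scores = select_ssr(data, [0.1, 0.2, 0.3, 0.4, 0.5])
print("selected SSR:", best_ssr)
print("CV accuracy per SSR:", scores)
```

Selecting the additional classifier instead of the SSR would follow the same pattern: fix the SSR and loop over candidate additional classifiers in place of the loop over `ssrs`. The accuracies this sketch prints come from synthetic data and say nothing about the results reported in the paper.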