Effect of Subsampling Rate on Subbagging and Related Ensembles of Stable Classifiers

Authors:
Faisal Zaman; Hideo Hirose
Affiliations:
Kyushu Institute of Technology, Fukuoka, Japan; Kyushu Institute of Technology, Fukuoka, Japan
Venue:
PReMI '09 Proceedings of the 3rd International Conference on Pattern Recognition and Machine Intelligence
Year:
2009

Abstract

In ensemble methods, bootstrap sampling is the most commonly preferred way to create multiple classifiers. Using subsampling to create the ensemble instead produces diverse members and induces instability in stable classifiers. The only parameter in subsampling is the subsampling rate (SSR), that is, how many observations are drawn from the training sample for each subsample. In this paper we present our work on the effect of different SSRs on two bagging-type ensembles of stable classifiers, Subbagging and Double Subbagging. We used three stable classifiers: Linear Support Vector Machine (LSVM), Stable Linear Discriminant Analysis (SLDA) and Logistic Linear Classifier (LOGLC). We also experimented with a decision tree to check whether the performance of a tree classifier is influenced by the SSR. The experiments show that, for most of the datasets, subbagging with stable classifiers at low SSR performs better than bagging and single stable classifiers, and in some cases better than double subbagging. We also found an opposite relation between the performance of double subbagging and that of subbagging.
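The role of the SSR in subbagging can be illustrated with a short sketch. This is not the authors' implementation. It assumes that subsamples are drawn without replacement, that the SSR is expressed as a fraction of the training set, and that members are combined by majority vote. The base learner is a toy nearest-centroid classifier standing in for a stable classifier; it is not one of the classifiers used in the paper. All names (`draw_subsample`, `subbagging_fit`, `subbagging_predict`, `n_members`) and the toy data are hypothetical.

```python
import random
from collections import Counter


def draw_subsample(n, ssr, rng):
    """Return indices of a subsample drawn WITHOUT replacement at rate ssr."""
    m = min(n, max(1, round(ssr * n)))  # subsample size, clamped to [1, n]
    return rng.sample(range(n), m)


class NearestCentroid:
    """Toy stable base learner (an assumption, not one of the paper's classifiers)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for j, v in enumerate(row):
                acc[j] += v
        # One mean vector per class seen in this subsample
        self.centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        return self

    def predict_one(self, row):
        def dist2(c):
            return sum((a - b) ** 2 for a, b in zip(row, self.centroids[c]))
        return min(self.centroids, key=dist2)


def subbagging_fit(X, y, ssr=0.2, n_members=25, seed=0):
    """Train n_members base learners, each on its own subsample of rate ssr."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        idx = draw_subsample(len(X), ssr, rng)
        members.append(
            NearestCentroid().fit([X[i] for i in idx], [y[i] for i in idx])
        )
    return members


def subbagging_predict(members, row):
    """Combine member predictions by majority vote."""
    votes = Counter(m.predict_one(row) for m in members)
    return votes.most_common(1)[0][0]


# Toy data: two well-separated classes of 20 points each (illustrative only)
X = [[0.1 * i, 0.1 * (i % 3)] for i in range(20)] + \
    [[10 + 0.1 * i, 10 + 0.1 * (i % 3)] for i in range(20)]
y = [0] * 20 + [1] * 20

members = subbagging_fit(X, y, ssr=0.2, n_members=25, seed=0)
print(subbagging_predict(members, [0.5, 0.5]))
print(subbagging_predict(members, [10.5, 10.5]))
```

Under these assumptions, `ssr` is the only sampling parameter: lowering it makes each member see fewer observations, so the members differ more from one another, which is the effect the abstract describes for stable classifiers.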