A comparative study on the performance of several ensemble methods with low subsampling ratio

  • Authors:
  • Zaman Faisal;Hideo Hirose

  • Affiliations:
  • Kyushu Institute of Technology, Iizuka-shi, Fukuoka, Japan;Kyushu Institute of Technology, Iizuka-shi, Fukuoka, Japan

  • Venue:
  • ACIIDS'10 Proceedings of the Second international conference on Intelligent information and database systems: Part II
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

In ensemble methods each base learner is trained on a resampled version of the original training sample with the same size. In this paper we have used resampling without replacement or subsampling to train base classifiers with low subsample ratio i.e., the size of each subsample is smaller than the original training sample. The main objective of this paper is to check if the scalability performance of several well known ensemble methods with low subsample ratio are competent and compare them with their original counterpart. We have selected three ensemble methods: Bagging, Adaboost and Bundling. In all the ensemble methods a full decision tree is used as the base classifier. We have applied the subsampled version of the above ensembles in several well known benchmark datasets to check the error rate. We have also checked the time complexity of each ensemble method with low subsampling ratio. From the experiments, it is apparent that in the case of bagging and adaboost with low subsampling ratio for most of the cases the error rate is inversely related with subsample size, while for bundling it is opposite. Overall performance of the ensemble methods with low subsampling ratio from experiments showed that bundling is superior in accuracy with low subsampling ratio in almost all the datasets, while bagging is superior in reducing time complexity.