The impact of random samples in ensemble classifiers

Authors:
Paulo Fernandes;Lucelene Lopes;Duncan D. A. Ruiz
Affiliations:
PPGCC - FACIN -PUCRS, Porto Alegre, Brazil;PPGCC - FACIN -PUCRS, Porto Alegre, Brazil;PPGCC - FACIN -PUCRS, Porto Alegre, Brazil
Venue:
Proceedings of the 2010 ACM Symposium on Applied Computing
Year:
2010

Abstract

Ensemble classifiers, e.g., Bagging and Boosting, are widespread in machine learning. However, most studies in this area are based on empirical comparisons that pay too little attention to the randomness of these methods. This paper describes the dangers of experiments with ensemble classifiers by analyzing the efficiency of Bagging and Boosting methods over 32 different data sets. The experiments show that variations due to randomness are often larger than the advantages of one method over another reported in the literature. This paper's main contribution is the claim, supported by statistical analysis, that no empirical comparison of ensemble classifiers can be scientifically sound without paying attention to the random choices made.
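The point about randomness can be illustrated with a minimal sketch. It is not the authors' experimental setup: the synthetic one-dimensional data, the decision-stump base learner, the ensemble size of 5, and the names `make_data`, `fit_stump`, and `bagging_accuracy` are all assumptions chosen for illustration, and only the Python standard library is used. The sketch trains the same Bagging configuration on the same data with 20 different random seeds and reports the spread of test accuracy.

```python
import random
import statistics


def make_data(n, rng):
    """Synthetic two-class data with overlapping classes (illustrative assumption)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = rng.gauss(1.0 if label else -1.0, 1.5)
        data.append((x, label))
    return data


def fit_stump(sample):
    """Return the threshold t (predict class 1 when x > t) with fewest errors on sample."""
    best_t, best_err = 0.0, len(sample) + 1
    for t, _ in sample:
        err = sum((x > t) != bool(y) for x, y in sample)
        if err < best_err:
            best_t, best_err = t, err
    return best_t


def bagging_accuracy(train, test, n_estimators, seed):
    """Train a bagged ensemble of stumps with the given seed; return test accuracy."""
    rng = random.Random(seed)
    thresholds = []
    for _ in range(n_estimators):
        # Bootstrap sample: draw len(train) instances with replacement.
        boot = [rng.choice(train) for _ in train]
        thresholds.append(fit_stump(boot))
    correct = 0
    for x, y in test:
        votes = sum(x > t for t in thresholds)
        pred = 1 if 2 * votes > n_estimators else 0  # majority vote
        correct += (pred == y)
    return correct / len(test)


# Fixed data, so the only source of variation below is the bootstrap seed.
data_rng = random.Random(0)
train = make_data(60, data_rng)
test = make_data(200, data_rng)

accuracies = [bagging_accuracy(train, test, 5, seed) for seed in range(20)]
print("min accuracy :", min(accuracies))
print("max accuracy :", max(accuracies))
print("std deviation:", statistics.pstdev(accuracies))
```

If the gap between the minimum and maximum accuracy across seeds is comparable to the gap between two methods, a single-run comparison of those methods says little; reporting results over several seeds makes this visible.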