An ensemble dependence measure

Authors:
Matthew Prior; Terry Windeatt
Affiliations:
Centre for Vision, Speech and Signal Processing, University of Surrey, Surrey, UK; Centre for Vision, Speech and Signal Processing, University of Surrey, Surrey, UK
Venue:
ICANN'07: Proceedings of the 17th International Conference on Artificial Neural Networks
Year:
2007

Abstract

Ensemble methods in supervised classification problems have been shown to be superior to single base classifiers of comparable performance, particularly when used in conjunction with multi-layer perceptron base classifiers. An ensemble's performance is related to the accuracy and diversity of its component classifiers. Intuitively, diversity seems to be a desirable quality for a collection of non-optimal classifiers. Although diversity has attracted much interest, little progress has been made in linking generalisation performance to any specific diversity metric. It can be shown theoretically that combining even modestly accurate, statistically independent classifiers forces ensemble accuracy close to optimality. Real-world ensembles nevertheless fall far short of this theoretical benchmark. The root of this problem is the lack of statistical independence amongst the base classifiers. We investigate a measure of statistical dependence in ensembles, D, and its relationship to the Q diversity metric and pairwise correlation, and also examine voting patterns in real-world ensembles. We show that, whilst Q is relatively insensitive to changes in the ensemble configuration, D measures correlations between the base classifiers effectively. The experiments are based on several two-class problems from the UCI data sets and use bootstrapped ensembles of relatively weak multi-layer perceptron base classifiers.
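
The quantities named in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions: the binomial expression for majority-vote accuracy under statistical independence, and the pairwise Q statistic and correlation coefficient computed from "oracle" outputs (1 if a classifier labels a sample correctly, 0 otherwise). The function names are illustrative and are not taken from the paper. The dependence measure D is not defined in the abstract, so it is not sketched here.

```python
from math import comb, sqrt


def majority_vote_accuracy(p, n_classifiers):
    """Accuracy of a majority vote over n independent classifiers,
    each correct with probability p (n must be odd to avoid ties)."""
    if n_classifiers % 2 == 0:
        raise ValueError("n_classifiers must be odd")
    k_min = n_classifiers // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_classifiers, k) * p**k * (1 - p) ** (n_classifiers - k)
        for k in range(k_min, n_classifiers + 1)
    )


def pairwise_counts(correct_a, correct_b):
    """Count joint outcomes of two classifiers over the same samples.
    Returns (n11, n10, n01, n00): both correct, only a, only b, both wrong."""
    n11 = n10 = n01 = n00 = 0
    for a, b in zip(correct_a, correct_b):
        if a and b:
            n11 += 1
        elif a:
            n10 += 1
        elif b:
            n01 += 1
        else:
            n00 += 1
    return n11, n10, n01, n00


def q_statistic(correct_a, correct_b):
    """Pairwise Q statistic; NaN when the denominator is zero."""
    n11, n10, n01, n00 = pairwise_counts(correct_a, correct_b)
    denominator = n11 * n00 + n01 * n10
    if denominator == 0:
        return float("nan")
    return (n11 * n00 - n01 * n10) / denominator


def pairwise_correlation(correct_a, correct_b):
    """Correlation coefficient of two binary oracle outputs; NaN if undefined."""
    n11, n10, n01, n00 = pairwise_counts(correct_a, correct_b)
    denominator = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denominator == 0:
        return float("nan")
    return (n11 * n00 - n01 * n10) / denominator


if __name__ == "__main__":
    # Independent classifiers at 60% accuracy: the vote improves with size.
    for n in (1, 3, 11, 101):
        print(n, round(majority_vote_accuracy(0.6, n), 4))

    # Two classifiers that never fail on the same sample.
    a = [1, 1, 1, 0]
    b = [1, 1, 0, 1]
    print(q_statistic(a, b), pairwise_correlation(a, b))
```

Under the independence assumption the majority-vote accuracy rises toward 1 as the ensemble grows whenever p exceeds 0.5, which is the theoretical benchmark the abstract refers to. Q and the correlation coefficient share the same numerator and therefore the same sign, but the magnitude of Q is never smaller than that of the correlation.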