Clustering ensembles of neural network models

  • Authors:
  • Bart Bakker; Tom Heskes

  • Affiliations:
  • SNN, University of Nijmegen, Geert Grooteplein 21, 6525 EZ Nijmegen, The Netherlands; SNN, University of Nijmegen, Geert Grooteplein 21, 6525 EZ Nijmegen, The Netherlands

  • Venue:
  • Neural Networks
  • Year:
  • 2003

Abstract

We show that large ensembles of (neural network) models, obtained, for example, by bootstrapping or by sampling from (Bayesian) probability distributions, can be effectively summarized by a relatively small number of representative models. In some cases this summary may even yield better function estimates. We present a method to find representative models through clustering based on the models' outputs on a data set. We apply the method to an ensemble of neural network models obtained from bootstrapping on the Boston housing data, and use the results to discuss bootstrapping in terms of bias and variance. A second application is the prediction of newspaper sales, where we learn a series of parallel tasks. The results indicate that it is not necessary to store all samples in the ensembles: a small number of representative models generally matches, or even surpasses, the performance of the full ensemble. The clustered representation of the ensemble thus obtained is much better suited to qualitative analysis, and will be shown to yield new insights into the data.
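
The idea of clustering models by their outputs on a data set can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' procedure: the abstract does not name the clustering algorithm, so plain k-means is used as a stand-in; bootstrapped least-squares models replace neural networks to keep the example self-contained; and all function names (`bootstrap_linear_models`, `model_outputs`, `kmeans`, `representatives`) are hypothetical.

```python
import numpy as np


def bootstrap_linear_models(X, y, n_models, rng):
    """Fit one least-squares model per bootstrap resample (stand-in for neural nets)."""
    n = X.shape[0]
    Xb = np.hstack([X, np.ones((n, 1))])  # append a bias column
    weights = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # resample with replacement
        w, *_ = np.linalg.lstsq(Xb[idx], y[idx], rcond=None)
        weights.append(w)
    return np.array(weights)  # shape (n_models, n_features + 1)


def model_outputs(weights, X):
    """Evaluate every model on X; each row is one model's output vector."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return weights @ Xb.T  # shape (n_models, n_points)


def kmeans(points, k, rng, n_iter=50):
    """Plain k-means (assumed stand-in; not the clustering used in the paper)."""
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return labels, centers


def representatives(outputs, k, rng):
    """Pick, per non-empty cluster, the model whose outputs lie closest to the center.

    Returns the representative model indices and their cluster-size weights.
    """
    labels, centers = kmeans(outputs, k, rng)
    reps, sizes = [], []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if len(members) == 0:
            continue
        dist = ((outputs[members] - centers[j]) ** 2).sum(axis=1)
        reps.append(members[dist.argmin()])
        sizes.append(len(members))
    sizes = np.array(sizes, dtype=float)
    return np.array(reps), sizes / sizes.sum()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))  # synthetic data, not the Boston housing set
    y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.3, size=100)

    W = bootstrap_linear_models(X, y, n_models=200, rng=rng)
    outputs = model_outputs(W, X)
    reps, wts = representatives(outputs, k=5, rng=rng)

    full_mean = outputs.mean(axis=0)  # prediction of the full ensemble
    summary = wts @ outputs[reps]     # cluster-size-weighted representatives
    print("representative models:", reps)
    print("mean |full - summary|:", np.abs(full_mean - summary).mean())
```

In this sketch each model is represented by its vector of outputs on the data set, so two models with different parameters but similar predictions fall into the same cluster. Weighting each representative by its cluster size is one simple way to approximate the full ensemble average; whether the summary matches or improves on the full ensemble depends on the data and on the number of clusters chosen.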