Building Committees by Clustering Models Based on Pairwise Similarity Values

  • Authors:
  • Thomas Ragg

  • Affiliations:
  • -

  • Venue:
  • ECML '01 Proceedings of the 12th European Conference on Machine Learning
  • Year:
  • 2001

Abstract

Forming a committee is an approach for integrating several opinions or functions instead of favouring a single one. Different algorithms select and weight the committee members in different ways, and how best to do so is still a topic of current research. Our starting point is the decomposition of the committee error into a bias-like and a variance-like term. Two requirements can be derived from this equation. On the one hand, models should be regularized properly to reduce the average error. On the other hand, they should be as independent as possible (in the mathematical sense) to decrease the committee error. The first requirement, regularization, can be handled by a Bayesian learning framework. For the second, I suggest a new selection method for committee members that is based on the pairwise stochastic dependence of their output functions and maximizes the overall independence. Given these pairwise similarity values, the models can be separated into classes by a hierarchical clustering algorithm. From the committee error decomposition I derive a criterion for finding the optimal number of classes, i.e. the optimal stopping criterion for the clustering algorithm. The benefits of the approach are demonstrated on a noisy benchmark problem as well as on the prediction of newspaper sales rates for a large number of retail traders.
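
The pipeline outlined in the abstract (pairwise similarities, hierarchical clustering, choice of the number of classes) can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not state the dependence measure or the derived stopping criterion, so the sketch assumes the absolute correlation of model residuals as the similarity value and simply picks the number of classes that gives the lowest committee error on the supplied data. All function names and the toy data are hypothetical.

```python
import numpy as np

def pairwise_similarity(outputs, targets):
    """Assumed similarity: absolute correlation between model residuals."""
    residuals = outputs - targets            # shape (n_models, n_samples)
    return np.abs(np.corrcoef(residuals))    # shape (n_models, n_models)

def agglomerate(similarity, n_clusters):
    """Average-linkage agglomerative clustering on a similarity matrix."""
    clusters = [[i] for i in range(len(similarity))]
    while len(clusters) > n_clusters:
        best, best_pair = -np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Mean similarity between all members of the two clusters.
                s = similarity[np.ix_(clusters[a], clusters[b])].mean()
                if s > best:
                    best, best_pair = s, (a, b)
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]   # merge the most similar pair
        del clusters[b]
    return clusters

def build_committee(outputs, targets):
    """Try every cluster count; keep the committee with the lowest error.

    Assumption: this exhaustive comparison stands in for the stopping
    criterion that the paper derives from the error decomposition.
    """
    model_mse = ((outputs - targets) ** 2).mean(axis=1)
    similarity = pairwise_similarity(outputs, targets)
    best_members, best_error = None, np.inf
    for k in range(1, len(outputs) + 1):
        clusters = agglomerate(similarity, k)
        # One representative per class: the member with the lowest error.
        members = sorted(min(c, key=lambda i: model_mse[i]) for c in clusters)
        committee_output = outputs[members].mean(axis=0)  # unweighted average
        error = ((committee_output - targets) ** 2).mean()
        if error < best_error:
            best_members, best_error = members, error
    return best_members, best_error

# Toy data (hypothetical): six models in two groups of three that share
# a common noise source, so models within a group are strongly dependent.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 200)
targets = np.sin(2 * np.pi * x)
shared = rng.normal(0, 0.3, size=(2, x.size))
outputs = np.stack([
    targets + shared[i // 3] + rng.normal(0, 0.05, x.size) for i in range(6)
])
members, error = build_committee(outputs, targets)
```

Because the case of a single class reduces to the best individual model, the selected committee in this sketch can never be worse than that model on the data used for selection. In practice the selection would be made on held-out data.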