Empirical comparisons of various voting methods in bagging

  • Authors: Kelvin T. Leung; D. Stott Parker
  • Affiliations: Los Angeles, California; Los Angeles, California
  • Venue: Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year: 2003

Abstract

Finding effective methods for developing an ensemble of models has been an active research area of large-scale data mining in recent years. Models learned from data are often subject to some degree of uncertainty, for a variety of reasons. In classification, ensembles of models provide a useful means of averaging out error introduced by individual classifiers, hence reducing the generalization error of prediction.

The plurality voting method is often chosen for bagging because of its simplicity of implementation. However, the plurality approach to model reconciliation is ad hoc. There are many other voting methods to choose from, including the anti-plurality method, the plurality method with elimination, the Borda count method, and Condorcet's method of pairwise comparisons. Any of these could lead to a better method for reconciliation.

In this paper, we analyze the use of these voting methods in model reconciliation. We present empirical results comparing the performance of these voting methods when applied in bagging. These results include some surprises, and among other things suggest that (1) plurality is not always the best voting method; (2) the number of classes can affect the performance of voting methods; and (3) the degree of dataset noise can affect the performance of voting methods. While it is premature to make final judgments about specific voting methods, the results of this work raise interesting questions, and they open the door to the application of voting theory in classification theory.
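To make the five voting methods named in the abstract concrete, the following is a minimal Python sketch of how each might combine the outputs of bagged classifiers. It is an illustration under stated assumptions, not the authors' implementation: it assumes every classifier supplies a complete ranking of the class labels (for example, by sorting predicted class probabilities), the function names are hypothetical, and ties are broken alphabetically by class label, a choice the abstract does not specify.

```python
from collections import Counter
from itertools import combinations

# Each ranking lists class labels from most to least preferred by one classifier.
# Assumption: every classifier ranks the same full set of classes.

def plurality(rankings):
    """Class ranked first by the most classifiers."""
    firsts = Counter(r[0] for r in rankings)
    return max(sorted(firsts), key=lambda c: firsts[c])

def anti_plurality(rankings):
    """Class ranked last by the fewest classifiers."""
    lasts = Counter(r[-1] for r in rankings)
    return min(sorted(rankings[0]), key=lambda c: lasts[c])

def borda(rankings):
    """Class with the highest positional score (k-1 points for first, 0 for last)."""
    k = len(rankings[0])
    scores = Counter()
    for r in rankings:
        for pos, c in enumerate(r):
            scores[c] += k - 1 - pos
    return max(sorted(scores), key=lambda c: scores[c])

def plurality_with_elimination(rankings):
    """Drop the class with the fewest first-place votes until one has a majority."""
    remaining = set(rankings[0])
    while True:
        # Each classifier votes for its highest-ranked class still in contention.
        firsts = Counter(next(c for c in r if c in remaining) for r in rankings)
        top = max(sorted(firsts), key=lambda c: firsts[c])
        if 2 * firsts[top] > len(rankings) or len(remaining) == 1:
            return top
        remaining.discard(min(sorted(remaining), key=lambda c: firsts[c]))

def condorcet(rankings):
    """Class that wins every pairwise majority contest, or None if none exists."""
    classes = sorted(rankings[0])
    wins = Counter()
    for a, b in combinations(classes, 2):
        a_pref = sum(r.index(a) < r.index(b) for r in rankings)
        if 2 * a_pref > len(rankings):
            wins[a] += 1
        elif 2 * a_pref < len(rankings):
            wins[b] += 1
    return next((c for c in classes if wins[c] == len(classes) - 1), None)

# Hypothetical ensemble of nine classifiers over classes A, B, C.
rankings = [list("ABC")] * 4 + [list("BCA")] * 3 + [list("CBA")] * 2

results = {
    "plurality": plurality(rankings),
    "anti-plurality": anti_plurality(rankings),
    "Borda": borda(rankings),
    "elimination": plurality_with_elimination(rankings),
    "Condorcet": condorcet(rankings),
}
print(results)
```

On this hand-built example, plurality selects A, while the other four methods select B. The example is constructed only to show that the methods can disagree on the same set of classifier outputs; it says nothing about which method is more accurate in practice, which is the question the paper studies empirically.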