To select or to weigh: a comparative study of model selection and model weighing for SPODE ensembles

  • Authors:
  • Ying Yang; Geoff Webb; Jesús Cerquides; Kevin Korb; Janice Boughton; Kai Ming Ting

  • Affiliations:
  • Clayton School of Information Technology, Monash University, Australia (Yang, Webb, Korb, Boughton, Ting); Departament de Matemàtica Aplicada i Anàlisi, Universitat de Barcelona, Spain (Cerquides)

  • Venue:
  • ECML'06: Proceedings of the 17th European Conference on Machine Learning
  • Year:
  • 2006

Abstract

An ensemble of Super-Parent-One-Dependence Estimators (SPODEs) offers a powerful yet simple alternative to naive Bayes classifiers, achieving significantly higher classification accuracy at a moderate cost in classification efficiency. Two families of methodologies currently exist for ensembling candidate SPODEs for classification. One selects only the helpful SPODEs and uniformly averages their probability estimates, a form of model selection. The other assigns a weight to each SPODE and linearly combines their probability estimates, a methodology known as model weighing. This paper presents a theoretical and empirical study comparing model selection and model weighing for ensembling SPODEs. The focus is on maximizing the ensemble's classification accuracy while minimizing its computational time. A number of representative selection and weighing schemes are studied, providing comprehensive coverage of this topic and identifying effective schemes that offer alternative trade-offs between speed and expected error.
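
To make the contrast concrete, the sketch below illustrates the two ensembling schemes on hypothetical SPODE class-probability estimates: model selection uniformly averages a chosen subset of SPODEs, while model weighing takes a weighted linear combination of all candidates. The probability values, the selected subset, and the weights are illustrative assumptions, not results or schemes from the paper.

```python
import numpy as np

# Hypothetical example: each row is one SPODE's class-probability estimate
# P(y | x) for a single test instance, over three classes.
spode_probs = np.array([
    [0.70, 0.20, 0.10],   # SPODE with super-parent attribute 1
    [0.55, 0.30, 0.15],   # SPODE with super-parent attribute 2
    [0.20, 0.60, 0.20],   # SPODE with super-parent attribute 3
    [0.40, 0.35, 0.25],   # SPODE with super-parent attribute 4
])

# Model selection: keep only the SPODEs judged "helpful" (an assumed choice
# of the first, second and fourth here) and average their estimates uniformly.
selected = [0, 1, 3]
selection_estimate = spode_probs[selected].mean(axis=0)

# Model weighing: assign each SPODE a weight (assumed values here) and
# linearly combine all estimates; weights are normalised to sum to one.
weights = np.array([0.4, 0.3, 0.1, 0.2])
weights = weights / weights.sum()
weighing_estimate = weights @ spode_probs

print("selection:", selection_estimate, "-> class", selection_estimate.argmax())
print("weighing: ", weighing_estimate, "-> class", weighing_estimate.argmax())
```

In this sketch, selection is a special case of weighing in which the chosen SPODEs receive equal weight and the rest receive weight zero; the paper's comparison concerns how the different schemes for choosing the subset or the weights trade classification accuracy against computational time.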