On the diversity-performance relationship for majority voting in classifier ensembles

  • Authors: Yun-Sheng Chung; D. Frank Hsu; Chuan Yi Tang

  • Affiliations: Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan; Department of Computer and Information Sciences, Fordham University, New York, NY; Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan

  • Venue: MCS'07 Proceedings of the 7th international conference on Multiple classifier systems
  • Year: 2007

Abstract

Combining multiple classifier systems (MCS) has been shown to outperform a single classifier system. It has been demonstrated that the improvement in ensemble performance depends on either the diversity among or the performance of the individual systems. A variety of diversity measures and ensemble methods have been proposed and studied. It remains a challenging problem to estimate the ensemble performance in terms of the performance of and the diversity among individual systems. In this paper, we establish upper and lower bounds for Pm (the performance of the ensemble using majority voting) in terms of P (the average performance of the individual systems) and D (the average entropy diversity measure among the individual systems). These bounds are shown to be tight using the concept of a performance distribution pattern (PDP) for the input set. Moreover, we show that when P is sufficiently large, the ensemble performance Pm resulting from a maximum (information-theoretic) entropy PDP is an increasing function of the diversity measure D. Five experiments using data sets from various application domains are conducted to demonstrate the complexity, richness, and diverseness of the problem of estimating the ensemble performance.
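
To make the three quantities in the abstract concrete, the following is a minimal Python sketch that computes P, Pm, and D from a binary "oracle output" matrix. It rests on assumptions not stated in the abstract: the function name `ensemble_stats` and the example matrix are hypothetical, the majority vote is counted as correct when more than half of the classifiers are correct on a sample, and D is taken to be the entropy diversity measure as commonly defined in the ensemble literature (Kuncheva and Whitaker). The paper's exact definitions may differ.

```python
import math


def ensemble_stats(oracle):
    """Return (P, Pm, D) for a binary oracle-output matrix.

    oracle[i][j] is 1 if classifier i is correct on sample j, else 0.
    Assumed definitions (not taken from the paper itself):
      P  = mean accuracy over all classifiers and samples
      Pm = fraction of samples on which more than half the classifiers are correct
      D  = entropy diversity measure, averaged over samples
    """
    n_clf = len(oracle)
    if n_clf < 2:
        raise ValueError("the entropy measure needs at least two classifiers")
    n_samples = len(oracle[0])

    # Number of classifiers that are correct on each sample.
    correct = [sum(oracle[i][j] for i in range(n_clf)) for j in range(n_samples)]

    P = sum(correct) / (n_clf * n_samples)
    Pm = sum(1 for c in correct if c > n_clf / 2) / n_samples

    # Per-sample disagreement min(c, L - c), normalised by L - ceil(L / 2).
    denom = n_clf - math.ceil(n_clf / 2)
    D = sum(min(c, n_clf - c) for c in correct) / (denom * n_samples)
    return P, Pm, D


# Hypothetical example: 3 classifiers, 4 samples.
example = [
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 1, 1],
]
print(ensemble_stats(example))  # (0.75, 1.0, 0.75)
```

In this toy case each classifier is only 75% accurate, yet the errors fall on different samples, so the majority vote is correct everywhere. This illustrates how Pm can exceed P when the individual systems are diverse; it does not reproduce the bounds established in the paper.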