Sharing Classifiers among Ensembles from Related Problem Domains

  • Authors: Yi Zhang; W. Nick Street; Samuel Burer
  • Affiliations: University of Iowa (all three authors)
  • Venue: ICDM '05, Proceedings of the Fifth IEEE International Conference on Data Mining
  • Year: 2005

Abstract

A classification ensemble is a group of classifiers that all solve the same prediction problem in different ways. It is well known that combining the predictions of classifiers within the same problem domain, using techniques such as bagging or boosting, often improves performance. This research shows that sharing classifiers among different but closely related problem domains can also be helpful. In addition, an ensemble pruning method based on semidefinite programming is implemented to optimize the selection of a subset of classifiers for each problem domain. Computational results on a catalog dataset indicate that ensembles built by sharing classifiers among different product categories generally have larger AUCs (areas under the ROC curve) than ensembles trained only on their own categories. The pruning algorithm not only prevents the occasional decrease in effectiveness caused by conflicting concepts among the problem domains, but also provides a better understanding of the problem domains and their relationships.
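
To make the idea of subset selection concrete, the following is a minimal sketch of ensemble pruning for one problem domain. The abstract does not state the optimization model, so everything here is an assumption: the objective (pick k classifiers whose errors on a validation set are few and overlap as little as possible), the names `coincidence_matrix` and `prune`, and the toy data are all hypothetical. The search is an exhaustive enumeration used as a stand-in; it is not the semidefinite programming method the authors implemented, and it is only practical for very small pools. In a sharing setting, the pool would contain classifiers trained on other categories as well, with errors measured on the target category.

```python
from itertools import combinations


def coincidence_matrix(errors):
    """Count shared mistakes between every pair of classifiers.

    errors[j][i] is 1 if classifier j misclassifies validation point i, else 0.
    Entry [p][q] is the number of points both p and q get wrong; the diagonal
    holds each classifier's own error count.
    """
    m = len(errors)
    return [
        [sum(a * b for a, b in zip(errors[p], errors[q])) for q in range(m)]
        for p in range(m)
    ]


def prune(errors, k):
    """Return the size-k subset with the smallest total error coincidence.

    Assumed objective (not taken from the paper): the sum of all matrix
    entries restricted to the chosen subset. Brute force over all subsets,
    standing in for a proper relaxation-based solver.
    """
    g = coincidence_matrix(errors)

    def cost(subset):
        return sum(g[p][q] for p in subset for q in subset)

    return min(combinations(range(len(errors)), k), key=cost)


# Hypothetical pool of five classifiers evaluated on six validation points.
errors = [
    [1, 1, 0, 0, 0, 0],  # classifier 0
    [1, 1, 1, 0, 0, 0],  # classifier 1: repeats classifier 0's mistakes
    [0, 0, 1, 0, 0, 0],  # classifier 2
    [0, 0, 0, 1, 1, 1],  # classifier 3: weakest member
    [0, 0, 0, 0, 1, 0],  # classifier 4
]

selected = prune(errors, 3)
print(selected)
```

On this toy pool the selection drops classifier 1, whose mistakes largely duplicate those of classifier 0, and classifier 3, which has the most errors. A relaxation-based solver becomes necessary once the pool is too large to enumerate.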