Tree ensembles for predicting structured outputs

  • Authors:
  • Dragi Kocev; Celine Vens; Jan Struyf; Sašo Džeroski

  • Affiliations:
  • Department of Knowledge Technologies, Jožef Stefan Institute, Jamova Cesta 39, 1000 Ljubljana, Slovenia; Department of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200A, 3001 Leuven, Belgium; Department of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200A, 3001 Leuven, Belgium; Department of Knowledge Technologies, Jožef Stefan Institute, Jamova Cesta 39, 1000 Ljubljana, Slovenia and International Postgraduate School Jožef Stefan, Jamova Cesta 39, 1000 Ljubljana, Slovenia ...

  • Venue:
  • Pattern Recognition
  • Year:
  • 2013

Abstract

In this paper, we address the task of learning models for predicting structured outputs. We consider both global and local predictions of structured outputs, the former based on a single model that predicts the entire output structure and the latter based on a collection of models, each predicting a component of the output structure. We use ensemble methods and apply them in the context of predicting structured outputs. We propose to build ensemble models consisting of predictive clustering trees, which generalize classification trees: these have been used for predicting different types of structured outputs, both locally and globally. More specifically, we develop methods for learning two types of ensembles (bagging and random forests) of predictive clustering trees for global and local predictions of different types of structured outputs. The types of outputs considered correspond to different predictive modeling tasks: multi-target regression, multi-target classification, and hierarchical multi-label classification. Each combination of ensemble type and output type can be applied either in the context of global prediction (producing a single ensemble) or local prediction (producing a collection of ensembles). We conduct an extensive experimental evaluation across a range of benchmark datasets for each of the three types of structured outputs. We compare ensembles for global and local prediction, as well as single trees for global prediction and tree collections for local prediction, both in terms of predictive performance and in terms of efficiency (running times and model complexity). The results show that both global and local tree ensembles perform better than their single-model counterparts in terms of predictive power. Global and local tree ensembles perform equally well, with global ensembles being more efficient and producing smaller models, as well as needing fewer trees in the ensemble to achieve maximal performance.
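
To make the global versus local distinction concrete, the following is a minimal sketch for the multi-target regression case only. It is not the authors' implementation: the tree learner, the split heuristic (weighted sum of per-target variances), the depth and leaf-size limits, the synthetic data, and all function names (build_tree, bagging, fit_global, fit_local) are assumptions made for illustration. A global ensemble bags trees whose leaves hold a full target vector; a local approach bags one separate ensemble per target.

```python
import random

def variance_sum(rows):
    """Sum over targets of the variance of that target within rows."""
    n, t = len(rows), len(rows[0][1])
    total = 0.0
    for j in range(t):
        mean = sum(y[j] for _, y in rows) / n
        total += sum((y[j] - mean) ** 2 for _, y in rows) / n
    return total

def mean_vector(rows):
    """Leaf prediction: the mean of every target over the rows in the leaf."""
    n, t = len(rows), len(rows[0][1])
    return [sum(y[j] for _, y in rows) / n for j in range(t)]

def build_tree(rows, depth=0, max_depth=3, min_leaf=2):
    """Tree that predicts all targets at once (assumed heuristic:
    minimise the size-weighted sum of per-target variances)."""
    best = None
    if depth < max_depth and len(rows) >= 2 * min_leaf:
        for f in range(len(rows[0][0])):
            for thr in sorted({x[f] for x, _ in rows}):
                left = [r for r in rows if r[0][f] <= thr]
                right = [r for r in rows if r[0][f] > thr]
                if len(left) < min_leaf or len(right) < min_leaf:
                    continue
                score = (len(left) * variance_sum(left)
                         + len(right) * variance_sum(right))
                if best is None or score < best[0]:
                    best = (score, f, thr, left, right)
    if best is None:
        return {"leaf": mean_vector(rows)}
    _, f, thr, left, right = best
    return {"f": f, "thr": thr,
            "left": build_tree(left, depth + 1, max_depth, min_leaf),
            "right": build_tree(right, depth + 1, max_depth, min_leaf)}

def predict_tree(tree, x):
    while "leaf" not in tree:
        tree = tree["left"] if x[tree["f"]] <= tree["thr"] else tree["right"]
    return tree["leaf"]

def bagging(rows, n_trees, rng):
    """Bagging: each tree is trained on a bootstrap sample of the rows."""
    return [build_tree([rng.choice(rows) for _ in rows])
            for _ in range(n_trees)]

def predict_ensemble(trees, x):
    """Average the target vectors predicted by the individual trees."""
    preds = [predict_tree(t, x) for t in trees]
    return [sum(p[j] for p in preds) / len(preds)
            for j in range(len(preds[0]))]

def fit_global(rows, n_trees, rng):
    """Global: one ensemble, every tree predicts the whole target vector."""
    return bagging(rows, n_trees, rng)

def fit_local(rows, n_trees, rng):
    """Local: one single-target ensemble per component of the output."""
    t = len(rows[0][1])
    return [bagging([(x, [y[j]]) for x, y in rows], n_trees, rng)
            for j in range(t)]

def predict_local(ensembles, x):
    return [predict_ensemble(e, x)[0] for e in ensembles]

# Synthetic data (hypothetical): two features, two correlated targets.
rng = random.Random(0)
data = []
for _ in range(60):
    x = [rng.random(), rng.random()]
    data.append((x, [2.0 * x[0], x[0] + x[1]]))

global_model = fit_global(data, 5, rng)   # 5 trees in total
local_models = fit_local(data, 5, rng)    # 5 trees per target, 10 in total
global_pred = predict_ensemble(global_model, [0.5, 0.5])
local_pred = predict_local(local_models, [0.5, 0.5])
```

In this sketch the local approach trains as many ensembles as there are targets, so its tree count grows with the output size, while the global ensemble keeps a single set of trees. A random forest variant would additionally restrict each split to a random subset of features, which is not shown here.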