Tree ensembles for predicting structured outputs

Authors:
Dragi Kocev; Celine Vens; Jan Struyf; Sašo Džeroski
Affiliations:
Department of Knowledge Technologies, Jožef Stefan Institute, Jamova Cesta 39, 1000 Ljubljana, Slovenia; Department of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200A, 3001 Leuven, Belgium; Department of Computer Science, Katholieke Universiteit Leuven, Celestijnenlaan 200A, 3001 Leuven, Belgium; Department of Knowledge Technologies, Jožef Stefan Institute, Jamova Cesta 39, 1000 Ljubljana, Slovenia and International Postgraduate School Jožef Stefan, Jamova Cesta 39, 1000 Ljubljana, Slovenia ...
Venue:
Pattern Recognition
Year:
2013


Abstract

In this paper, we address the task of learning models for predicting structured outputs. We consider both global and local prediction of structured outputs: the former uses a single model that predicts the entire output structure, while the latter uses a collection of models, each predicting one component of the output structure. We apply ensemble methods in this context and propose to build ensembles of predictive clustering trees, which generalize classification trees and have been used for predicting different types of structured outputs, both locally and globally. More specifically, we develop methods for learning two types of ensembles (bagging and random forests) of predictive clustering trees for global and local prediction of different types of structured outputs. The types of outputs considered correspond to different predictive modeling tasks: multi-target regression, multi-target classification, and hierarchical multi-label classification. Each combination of ensemble type and task can be applied either for global prediction (producing a single ensemble) or for local prediction (producing a collection of ensembles). We conduct an extensive experimental evaluation across a range of benchmark datasets for each of the three types of structured outputs. We compare ensembles for global and local prediction, as well as single trees for global prediction and tree collections for local prediction, both in terms of predictive performance and in terms of efficiency (running times and model complexity). The results show that both global and local tree ensembles outperform their single-model counterparts in terms of predictive performance. Global and local tree ensembles perform equally well, but global ensembles are more efficient, produce smaller models, and need fewer trees to reach maximal performance.
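
To make the global versus local distinction concrete, the following is a minimal sketch for multi-target regression using bagging only. It is not the authors' implementation of predictive clustering trees: the split heuristic (minimising the summed per-target variance), the depth and leaf-size limits, the synthetic data, and all function names are assumptions chosen for illustration.

```python
import random

def variance_sum(rows):
    """Sum over targets of the variance of that target within rows."""
    n, k = len(rows), len(rows[0][1])
    total = 0.0
    for t in range(k):
        mean = sum(y[t] for _, y in rows) / n
        total += sum((y[t] - mean) ** 2 for _, y in rows) / n
    return total

def mean_vector(rows):
    """Leaf prediction: the mean target vector of the rows in the leaf."""
    n, k = len(rows), len(rows[0][1])
    return [sum(y[t] for _, y in rows) / n for t in range(k)]

def build_tree(rows, depth=0, max_depth=3, min_leaf=2):
    """Grow a tree whose splits minimise the summed variance of all targets
    (an assumed heuristic, used here as a stand-in for the paper's trees)."""
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return {"leaf": mean_vector(rows)}
    best = None
    for f in range(len(rows[0][0])):
        for thr in sorted({x[f] for x, _ in rows}):
            left = [r for r in rows if r[0][f] <= thr]
            right = [r for r in rows if r[0][f] > thr]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = len(left) * variance_sum(left) + len(right) * variance_sum(right)
            if best is None or score < best[0]:
                best = (score, f, thr, left, right)
    if best is None:
        return {"leaf": mean_vector(rows)}
    _, f, thr, left, right = best
    return {"feature": f, "threshold": thr,
            "left": build_tree(left, depth + 1, max_depth, min_leaf),
            "right": build_tree(right, depth + 1, max_depth, min_leaf)}

def predict_tree(tree, x):
    while "leaf" not in tree:
        tree = tree["left"] if x[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["leaf"]

def bagging(rows, n_trees, rng):
    """Train each tree on a bootstrap sample drawn with replacement."""
    return [build_tree([rng.choice(rows) for _ in rows]) for _ in range(n_trees)]

def predict_ensemble(trees, x):
    """Average the target vectors predicted by the individual trees."""
    preds = [predict_tree(t, x) for t in trees]
    return [sum(p[t] for p in preds) / len(preds) for t in range(len(preds[0]))]

def fit_local(rows, n_trees, rng):
    """Local prediction: one single-target ensemble per output component."""
    k = len(rows[0][1])
    return [bagging([(x, [y[t]]) for x, y in rows], n_trees, rng) for t in range(k)]

def predict_local(ensembles, x):
    return [predict_ensemble(e, x)[0] for e in ensembles]

# Synthetic data (an assumption): two features, three related targets.
rng = random.Random(0)
data = []
for _ in range(60):
    x = [rng.random(), rng.random()]
    data.append((x, [2 * x[0], x[0] + x[1], 1 - x[1]]))

global_model = bagging(data, 10, rng)    # global: one ensemble for all targets
local_model = fit_local(data, 10, rng)   # local: one ensemble per target
n_global_trees = len(global_model)
n_local_trees = sum(len(e) for e in local_model)
global_pred = predict_ensemble(global_model, [0.5, 0.5])
local_pred = predict_local(local_model, [0.5, 0.5])
```

In this sketch the local approach trains three times as many trees as the global one for the same ensemble size, which illustrates why a global ensemble can yield a smaller overall model. It says nothing about the relative predictive performance reported in the paper.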