We present a new regression algorithm called Groves of trees and show empirically that it outperforms a number of established regression methods. A Grove is an additive model, usually containing a small number of large trees. Each tree added to a Grove is trained on the residual error of the trees already in it. Training begins with a single small tree, and both the number of trees in the Grove and their size are increased gradually; this schedule ensures that the resulting model captures the additive structure of the response. Because a single Grove may still overfit the training set, we further reduce the variance of the final predictions with bagging. We show that, in addition to performing well on a suite of regression test problems, bagged Groves of trees are very resistant to overfitting.
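The training procedure described above (fit each tree to the residual of the rest, gradually enlarge the trees, then bag several Groves) can be summarized in a short sketch. The Python code below is a minimal illustration built on scikit-learn's DecisionTreeRegressor, not the authors' implementation: the function names (train_grove, bagged_groves) and parameters (n_trees, leaf_schedule, n_bags) are illustrative, and the paper's schedule of growing the number of trees alongside their size is simplified here to a fixed number of trees whose size grows.

```python
# Minimal sketch of the Grove idea, under the assumptions stated above.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def train_grove(X, y, n_trees=5, leaf_schedule=(2, 4, 8, 16)):
    """Fit an additive Grove: each tree is (re)fit to the residual of
    the other trees, and trees are enlarged over successive passes so
    additive structure is captured before any single tree gets complex.
    NOTE: the paper also grows the *number* of trees gradually; this
    sketch keeps n_trees fixed for brevity."""
    X, y = np.asarray(X), np.asarray(y, dtype=float)
    preds = np.zeros((n_trees, len(y)))   # per-tree training predictions
    grove = [None] * n_trees
    for max_leaves in leaf_schedule:      # gradually enlarge the trees
        for i in range(n_trees):          # backfitting pass over the Grove
            residual = y - (preds.sum(axis=0) - preds[i])
            tree = DecisionTreeRegressor(max_leaf_nodes=max_leaves)
            tree.fit(X, residual)
            grove[i] = tree
            preds[i] = tree.predict(X)
    return grove

def predict_grove(grove, X):
    return sum(tree.predict(X) for tree in grove)

def bagged_groves(X, y, n_bags=10, seed=0, **grove_kw):
    """Reduce the variance of a single Grove by averaging Groves
    trained on bootstrap samples of the training set."""
    X, y = np.asarray(X), np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    groves = []
    for _ in range(n_bags):
        idx = rng.integers(0, len(y), size=len(y))  # bootstrap sample
        groves.append(train_grove(X[idx], y[idx], **grove_kw))
    return groves

def predict_bagged(groves, X):
    return np.mean([predict_grove(g, X) for g in groves], axis=0)
```

Refitting each tree on the residual of the others is a backfitting step: it lets the ensemble recover the additive components of the response while the individual trees are still small, and bagging then averages away the variance that a single Grove may retain.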