A Further Comparison of Splitting Rules for Decision-Tree Induction. Machine Learning.
Original Contribution: Stacked Generalization. Neural Networks.
The Nature of Statistical Learning Theory.
On the Algorithmic Implementation of Stochastic Discrimination. IEEE Transactions on Pattern Analysis and Machine Intelligence.
C4.5: Programs for Machine Learning.
Linear Programming Boosting via Column Generation. Machine Learning.
Boosting as a Regularized Path to a Maximum Margin Classifier. The Journal of Machine Learning Research.
Support Vector Machinery for Infinite Ensemble Learning. The Journal of Machine Learning Research.
Consistency of Random Forests and Other Averaging Classifiers. The Journal of Machine Learning Research.
Spectrum of Variable-Random Trees. Journal of Artificial Intelligence Research.
ECML PKDD'11: Proceedings of the 2011 European Conference on Machine Learning and Knowledge Discovery in Databases, Volume Part II.
Ensembles of randomized trees, such as Random Forests, are among the most popular tools in machine learning and data mining. These algorithms introduce randomness into the induction of several decision trees and then use a voting scheme to predict unseen instances. In this paper, randomized tree ensembles are studied from the point of view of the basis functions they induce. We point out a connection with kernel target alignment, a measure of kernel quality, which suggests that randomization is a way to obtain high alignment and hence possibly low generalization error. The connection also suggests post-processing ensembles with sophisticated linear separators such as Support Vector Machines (SVMs). Interestingly, post-processing experimentally yields better performance than classical majority voting. We conclude by comparing these results with an approximate infinite ensemble classifier very similar to the one introduced by Lin and Li. This methodology also shows strong learning ability, comparable to ensemble post-processing.
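As a rough illustration of the pipeline the abstract describes (a hypothetical sketch using scikit-learn, not the authors' implementation), one can build the kernel a Random Forest induces from leaf co-occurrence, measure its kernel target alignment against the label kernel, and post-process the ensemble with an SVM on the precomputed kernel. All dataset sizes and hyperparameters below are illustrative:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Toy binary classification problem (sizes are arbitrary).
X, y = make_classification(n_samples=200, random_state=0)
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Each tree maps a sample to a leaf; two samples are "similar" if they
# fall in the same leaf. Averaging this indicator over trees yields a
# valid kernel induced by the ensemble's basis functions.
leaves = rf.apply(X)  # shape (n_samples, n_trees): leaf index per tree
K = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)

# Kernel target alignment: cosine similarity (Frobenius inner product)
# between K and the ideal kernel y y^T built from +/-1 labels.
yy = np.outer(2 * y - 1, 2 * y - 1)
alignment = (K * yy).sum() / (np.linalg.norm(K) * np.linalg.norm(yy))

# Post-process the ensemble with an SVM trained on the precomputed
# kernel instead of using majority voting.
svm = SVC(kernel="precomputed").fit(K, y)
acc = svm.score(K, y)  # training accuracy on the same Gram matrix
```

The kernel matrix is symmetric with unit diagonal (every sample shares all its own leaves), so it can be passed directly to any kernel method via `kernel="precomputed"`; predicting on new points only requires cross-leaf-agreement rows between test and training samples.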