Variable randomness in decision tree ensembles

Authors:
Fei Tony Liu;Kai Ming Ting
Affiliations:
Gippsland School of Information Technology, Monash University, Churchill, Australia;Gippsland School of Information Technology, Monash University, Churchill, Australia
Venue:
PAKDD'06 Proceedings of the 10th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
Year:
2006

Citing 11
Cited 2

C4.5: programs for machine learning
Bagging predictors

Machine Learning
Shape quantization and recognition with randomized trees

Neural Computation
The Random Subspace Method for Constructing Decision Forests

IEEE Transactions on Pattern Analysis and Machine Intelligence
An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization

Machine Learning
Random Forests

Machine Learning
Measures of Diversity in Classifier Ensembles and Their Relationship with the Ensemble Accuracy

Machine Learning
Is random model better? On its accuracy and efficiency

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Concept learning and the problem of small disjuncts

IJCAI'89 Proceedings of the 11th international joint conference on Artificial intelligence - Volume 1
Maximizing tree diversity by building complete-random decision trees

PAKDD'05 Proceedings of the 9th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining
Combinations of weak classifiers

IEEE Transactions on Neural Networks

Spectrum of variable-random trees

Journal of Artificial Intelligence Research
Cost-sensitive classifier evaluation using cost curves

PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining

Abstract

In this paper, we propose Max-diverse.α, which has a mechanism to control the degree of randomness in decision tree ensembles. This control gives an ensemble the means to balance the two conflicting functions of a random ensemble, i.e., the ability to model non-axis-parallel boundaries and the ability to eliminate irrelevant features. We find that this control is more sensitive than the one provided by Random Forests. Using progressive training errors, we are able to estimate an appropriate degree of randomness for any given data set prior to any predictive task. Experimental results show that Max-diverse.α is significantly better than Random Forests and Max-diverse Ensemble, and that it is comparable to the state-of-the-art C5 boosting.
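
The abstract does not spell out how the randomness control works, so the following is only a minimal sketch under a stated assumption: α is treated as the per-node probability of using a deterministic (information-gain) split, with a uniformly random feature chosen otherwise. The function names `entropy`, `info_gain`, and `choose_split` are hypothetical and are not taken from the paper.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def info_gain(rows, labels, feature):
    """Information gain from splitting on one categorical feature index."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def choose_split(rows, labels, features, alpha, rng):
    """Pick the split feature for one tree node.

    ASSUMPTION (not stated in the abstract): with probability alpha the
    split is deterministic (highest information gain); otherwise a feature
    is drawn uniformly at random. alpha=0 gives fully random splits and
    alpha=1 gives fully deterministic ones.
    """
    if rng.random() < alpha:
        return max(features, key=lambda f: info_gain(rows, labels, f))
    return rng.choice(features)
```

Under this assumption, calling `choose_split` with `alpha=1.0` always returns the most informative feature, which favours eliminating irrelevant features, while `alpha=0.0` ignores the labels entirely, which favours diversity among the trees. Intermediate values trade one against the other.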