Limiting the Number of Trees in Random Forests

Authors:
Patrice Latinne;Olivier Debeir;Christine Decaestecker
Affiliations:
-;-;-
Venue:
MCS '01 Proceedings of the Second International Workshop on Multiple Classifier Systems
Year:
2001

Citing 15
Cited 8

The Strength of Weak Learnability

Machine Learning
C4.5: programs for machine learning

C4.5: programs for machine learning
Bagging predictors

Machine Learning
Error reduction through learning multiple descriptions

Machine Learning
The Random Subspace Method for Constructing Decision Forests

IEEE Transactions on Pattern Analysis and Machine Intelligence
Approximate statistical tests for comparing supervised classification learning algorithms

Neural Computation
An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization

Machine Learning
An approach to the automatic design of multiple classifier systems

Pattern Recognition Letters - Special issue on machine learning and data mining in pattern recognition
On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach

Data Mining and Knowledge Discovery
Combining Nearest Neighbor Classifiers Through Multiple Feature Subsets

ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Different Ways of Weakening Decision Trees and Their Impact on Classification Accuracy of DT Combination

MCS '00 Proceedings of the First International Workshop on Multiple Classifier Systems
Option Decision Trees with Majority Votes

ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning
Generating Classifier Committees by Stochastically Selecting both Attributes and Training Examples

PRICAI '98 Proceedings of the 5th Pacific Rim International Conference on Artificial Intelligence: Topics in Artificial Intelligence
Bagging, boosting, and C4.5

AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1
Combinations of weak classifiers

IEEE Transactions on Neural Networks

Forest-RK: A New Random Forest Induction Method

ICIC '08 Proceedings of the 4th international conference on Intelligent Computing: Advanced Intelligent Computing Theories and Applications - with Aspects of Artificial Intelligence
A Study of Random Linear Oracle Ensembles

MCS '09 Proceedings of the 8th International Workshop on Multiple Classifier Systems
Thinned-ECOC ensemble based on sequential code shrinking

Expert Systems with Applications: An International Journal
Dynamics of variance reduction in bagging and other techniques based on randomisation

MCS'05 Proceedings of the 6th international conference on Multiple Classifier Systems
A double pruning algorithm for classification ensembles

MCS'10 Proceedings of the 9th international conference on Multiple Classifier Systems
Dynamic Random Forests

Pattern Recognition Letters
How many trees in a random forest?

MLDM'12 Proceedings of the 8th international conference on Machine Learning and Data Mining in Pattern Recognition
How large should ensembles of classifiers be?

Pattern Recognition

Abstract

The aim of this paper is to propose a simple procedure that determines a priori the minimum number of classifiers to combine in order to obtain a prediction accuracy similar to that of larger ensembles. The procedure is based on McNemar's non-parametric test of significance. Knowing a priori the minimum ensemble size that gives the best prediction accuracy saves time and memory, especially for huge databases and real-time applications. We applied this procedure to four multiple classifier systems built on C4.5 decision trees (Breiman's Bagging, Ho's Random Subspaces, their combination, which we label 'Bagfs', and Breiman's Random Forests) and to five large benchmark databases. The procedure can easily be extended to base learning algorithms other than decision trees. The experimental results show that the number of trees can be limited significantly. They also show that the minimum number of trees required to obtain the best prediction accuracy may vary from one classifier combination method to another.
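
The abstract does not spell out the selection procedure, so the following is only a minimal Python sketch of one way a McNemar-based stopping rule could look, not the authors' actual algorithm. It assumes per-tree predictions on a held-out set are already available, combines trees by plain majority vote, uses the continuity-corrected McNemar statistic, and returns the first ensemble size whose errors are not significantly different from those of the full ensemble. The function names, the `alpha` threshold, and the "first non-significant size" rule are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def majority_vote(votes):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(votes).most_common(1)[0][0]


def mcnemar_p_value(pred_a, pred_b, truth):
    """Two-sided McNemar test (continuity-corrected) on paired predictions."""
    # b: A correct and B wrong; c: A wrong and B correct (discordant pairs).
    b = sum(1 for a, o, t in zip(pred_a, pred_b, truth) if a == t and o != t)
    c = sum(1 for a, o, t in zip(pred_a, pred_b, truth) if a != t and o == t)
    if b + c == 0:
        return 1.0  # the two classifiers never disagree in correctness
    chi2 = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def minimum_ensemble_size(tree_preds, truth, alpha=0.05):
    """Smallest k whose k-tree vote is not significantly different
    from the vote of all trees (assumed stopping rule, see lead-in)."""
    n_trees, n_samples = len(tree_preds), len(truth)
    full = [majority_vote(tree_preds[t][i] for t in range(n_trees))
            for i in range(n_samples)]
    for k in range(1, n_trees + 1):
        sub = [majority_vote(tree_preds[t][i] for t in range(k))
               for i in range(n_samples)]
        if mcnemar_p_value(sub, full, truth) >= alpha:
            return k
    return n_trees


if __name__ == "__main__":
    # Toy data: one weak tree followed by two perfect trees (hypothetical).
    truth = [1] * 10 + [0] * 10
    weak = [0] * 20
    print(minimum_ensemble_size([weak, truth, truth], truth))
```

In this sketch `tree_preds[t][i]` is the label that tree `t` assigns to held-out sample `i`. A more careful variant might require the difference to stay non-significant for all larger sizes, since accuracy need not improve monotonically as trees are added.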