How many trees in a random forest?
MLDM'12 Proceedings of the 8th international conference on Machine Learning and Data Mining in Pattern Recognition
Random Forest is a computationally efficient technique that can operate quickly over large datasets. It has been used in many recent research projects and real-world applications in diverse domains. However, the associated literature provides little information about what happens inside the trees of a Random Forest. The research reported here analyzes the frequency with which each attribute appears in the root node of the trees of a Random Forest, in order to determine whether all attributes are used with equal frequency or whether some are used much more often than others. Additionally, we analyze the estimated out-of-bag error of the trees, to check whether the most frequently used attributes also yield good performance. Furthermore, we analyze whether pre-pruning influences the performance of the Random Forest, again measured by out-of-bag error. Our main conclusions are that the frequency of attributes in the root node follows an exponential distribution, and that the estimated out-of-bag error can help to identify relevant attributes within the forest. Concerning pre-pruning, we observed that execution time can be reduced without significant loss of performance.
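The two quantities the abstract describes, root-node attribute frequency and out-of-bag error, can both be inspected directly in a trained forest. The sketch below is an illustration using scikit-learn (not the authors' original experimental setup) on the Iris dataset: it counts which attribute each tree places in its root node and reads the forest's out-of-bag accuracy estimate.

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# Train a forest with out-of-bag estimation enabled.
forest = RandomForestClassifier(
    n_estimators=200,
    oob_score=True,       # estimate generalization error from OOB samples
    random_state=0,
)
forest.fit(X, y)

# For each tree, the attribute tested at the root node is the first
# entry of the internal feature array (index 0 is always the root).
root_features = Counter(tree.tree_.feature[0] for tree in forest.estimators_)
print(root_features.most_common())

# OOB accuracy of the whole forest (OOB error = 1 - oob_score_).
print(forest.oob_score_)
```

On a dataset like Iris, the root-node counts are typically far from uniform: a few highly discriminative attributes dominate the roots, which is consistent with the skewed (exponential-like) frequency behavior reported above. Pre-pruning in this setting would correspond to passing parameters such as `max_depth` or `min_samples_leaf` to the classifier.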