This article introduces a margin-optimization-based pruning algorithm that reduces the ensemble size and improves the performance of a random forest. A key element of the proposed algorithm is that it directly takes into account the margin distribution of the random forest model on the training set. Four different metrics based on the margin distribution are used to evaluate the generalization ability of subensembles and the importance of individual classification trees in an ensemble. After a forest is built, the trees in the ensemble are first ranked according to the margin metrics, and subensembles of decreasing size are then built by recursively removing the least important trees one by one. Experiments on 10 benchmark datasets demonstrate that the proposed algorithm can significantly improve generalization performance while simultaneously reducing ensemble size. Furthermore, empirical comparison with other pruning methods indicates that the margin distribution plays an important role in evaluating the performance of a random forest and can be used directly to select near-optimal subensembles.
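The abstract does not specify the four margin metrics, so the sketch below is a minimal illustration of the general idea: it uses the standard ensemble margin (the fraction of votes for the true class minus the largest fraction for any other class) and greedily removes, at each step, the tree whose removal maximizes the mean training margin of the remaining subensemble. The function names `ensemble_margins` and `prune_by_margin` are hypothetical, and backward elimination on a precomputed prediction matrix is just one plausible realization of the ranking-and-removal procedure described above.

```python
import numpy as np

def ensemble_margins(preds, y, idx):
    """Margin of the subensemble given by tree indices `idx` on each sample.

    preds: (n_trees, n_samples) array of predicted class labels per tree.
    y:     (n_samples,) array of true class labels.
    Margin = (votes for true class - max votes for any other class) / |idx|.
    """
    n_samples = len(y)
    n_classes = int(max(preds.max(), y.max())) + 1
    votes = np.zeros((n_samples, n_classes))
    for row in preds[idx]:
        votes[np.arange(n_samples), row] += 1
    true_votes = votes[np.arange(n_samples), y].copy()
    votes[np.arange(n_samples), y] = -1  # mask the true class
    other_votes = votes.max(axis=1)
    return (true_votes - other_votes) / len(idx)

def prune_by_margin(preds, y, target_size):
    """Backward elimination: repeatedly drop the tree whose removal
    maximizes the mean margin of the remaining subensemble."""
    idx = list(range(preds.shape[0]))
    while len(idx) > target_size:
        worst = max(
            idx,
            key=lambda t: ensemble_margins(
                preds, y, [i for i in idx if i != t]
            ).mean(),
        )
        idx.remove(worst)
    return idx
```

On a toy prediction matrix where one tree is always wrong, the procedure removes that tree first, since dropping it yields the largest mean margin for the survivors.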