Is bagging effective in the classification of small-sample genomic and proteomic data?
EURASIP Journal on Bioinformatics and Systems Biology — Special issue on applications of signal processing techniques to bioinformatics, genomics, and proteomics
Application of ensemble classification rules in genomics and proteomics has become increasingly common. However, the problem of error estimation for these classification rules, particularly for bagging under the small-sample settings prevalent in genomics and proteomics, is not well understood. Breiman proposed the "out-of-bag" method for estimating statistics of bagged classifiers, which other authors subsequently applied to estimate the classification error. In this paper, we give an explicit definition of the out-of-bag estimator, intended to remove estimator bias by carefully formulating how the error count is normalized. We also report the results of an extensive simulation study of bagging of common classification rules, including LDA, 3NN, and CART, applied to both synthetic and real patient data, using common error estimators such as resubstitution, leave-one-out, cross-validation, basic bootstrap, bootstrap 632, bootstrap 632 plus, bolstering, and semi-bolstering, in addition to the out-of-bag estimator. The numerical experiments indicate that the performance of the out-of-bag estimator is very similar to that of leave-one-out; in particular, the out-of-bag estimator is slightly pessimistically biased. The performance of the other estimators is consistent with their performance with the corresponding single classifiers, as reported in other studies.
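The out-of-bag idea described in the abstract can be illustrated with a minimal sketch: for each training point, a prediction is formed by majority vote over only those bagged classifiers whose bootstrap sample excluded that point, and the error count is normalized over the points that were out-of-bag at least once. This is an illustrative toy implementation, not the paper's exact estimator; 1-nearest-neighbor on 1-D features is used as a stand-in for base classifiers such as LDA, 3NN, or CART, and all names and data are hypothetical.

```python
import random

def bootstrap_sample(n, rng):
    """Draw a bootstrap sample of indices; indices not drawn are out-of-bag."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    oob = set(range(n)) - set(in_bag)
    return in_bag, oob

def nn_classify(x, train_X, train_y):
    """1-nearest-neighbor on scalar features (stand-in base classifier)."""
    i = min(range(len(train_X)), key=lambda j: abs(train_X[j] - x))
    return train_y[i]

def oob_error(X, y, n_bags=100, seed=0):
    """Toy out-of-bag error estimate for a bagged 1NN classifier (2 classes).

    Each point is classified by majority vote over the bags in which it was
    out-of-bag; the error count is normalized by the number of points that
    were out-of-bag at least once.
    """
    rng = random.Random(seed)
    n = len(X)
    votes = [[0, 0] for _ in range(n)]  # votes[i][label] over OOB bags
    for _ in range(n_bags):
        in_bag, oob = bootstrap_sample(n, rng)
        tx = [X[i] for i in in_bag]
        ty = [y[i] for i in in_bag]
        for i in oob:
            votes[i][nn_classify(X[i], tx, ty)] += 1
    errors = covered = 0
    for i in range(n):
        if votes[i][0] + votes[i][1] == 0:
            continue  # never out-of-bag; excluded from the normalization
        covered += 1
        pred = 0 if votes[i][0] >= votes[i][1] else 1
        errors += int(pred != y[i])
    return errors / covered if covered else 0.0
```

For example, `oob_error([0.1, 0.2, 0.3, 0.9, 1.0, 1.1], [0, 0, 0, 1, 1, 1])` returns a value in [0, 1]; on such well-separated toy data the estimate is typically near zero.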