Boosting methods are known to exhibit noticeable overfitting on some datasets, while being immune to overfitting on others. In this paper we show that standard boosting algorithms are not appropriate in the case of overlapping classes. This inadequacy is likely the major source of boosting overfitting on real-world data. To verify our conclusion we use the fact that any task with overlapping classes can be reduced to a deterministic task with the same Bayesian separating surface. This can be done by removing "confusing samples", i.e., samples that are misclassified by a "perfect" Bayesian classifier. We propose an algorithm for removing confusing samples and experimentally study the behavior of AdaBoost trained on the resulting data sets. Experiments confirm that removing confusing samples helps boosting to reduce the generalization error and to avoid overfitting on both synthetic and real-world data. The process of removing confusing samples also provides an accurate prediction of the generalization error based on the training sets alone.
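The abstract does not spell out the removal algorithm, but the core idea can be sketched: since the true Bayesian classifier is unknown in practice, one approximates it and discards training samples it misclassifies before running AdaBoost. The snippet below is a minimal illustration, not the authors' method; it assumes that out-of-fold class-probability estimates from a random forest stand in for the "perfect" Bayesian classifier, and all dataset parameters and model settings are illustrative.

```python
# Illustrative sketch only: the abstract does not give the authors' algorithm,
# so the "perfect" Bayesian classifier is approximated here with out-of-fold
# posterior estimates from a random forest (an assumption for demonstration).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_predict, train_test_split

# Synthetic two-class task with deliberately overlapping classes.
X, y = make_classification(n_samples=2000, n_features=10, n_informative=5,
                           flip_y=0.15, class_sep=0.8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5,
                                                    random_state=0)

# Approximate the Bayes-optimal prediction for each training sample using
# out-of-fold class-probability estimates (no sample votes on itself).
proba = cross_val_predict(
    RandomForestClassifier(n_estimators=200, random_state=0),
    X_train, y_train, cv=5, method="predict_proba")
bayes_like_pred = proba.argmax(axis=1)

# "Confusing samples": training points the approximate Bayes classifier misclassifies.
keep = bayes_like_pred == y_train
print(f"Removed {np.sum(~keep)} of {len(y_train)} training samples")

# Train AdaBoost on the original and on the cleaned training set.
ada_raw = AdaBoostClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
ada_clean = AdaBoostClassifier(n_estimators=200, random_state=0).fit(X_train[keep],
                                                                     y_train[keep])

print("AdaBoost on all samples:     ", accuracy_score(y_test, ada_raw.predict(X_test)))
print("AdaBoost on cleaned samples: ", accuracy_score(y_test, ada_clean.predict(X_test)))
```

Under this approximation, the fraction of removed samples can also serve as a rough training-set-based estimate of the irreducible error, which is in the spirit of the abstract's claim that the removal process yields an error prediction from the training sets alone.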