This paper analyzes boosting in unscaled versions of ROC spaces, also referred to as PN spaces. A minor revision of AdaBoost's reweighting strategy is analyzed, which makes it possible to reformulate the algorithm in terms of stratification and to visualize the boosting process in nested PN spaces, as known from divide-and-conquer rule learning. The analyzed confidence-rated algorithm is proven to exploit its base models more effectively in each iteration, while still searching a space of linear combinations of discrete base classifiers. The algorithm reduces the training error more quickly without sacrificing any of the advantages of the original AdaBoost. The PN-space interpretation allows a lower bound on the area under the ROC curve (AUC) of the resulting ensembles to be derived from the AUC after reweighting. The theoretical findings of this paper are complemented by an empirical evaluation on benchmark datasets.
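The reweighting variant discussed above can be illustrated with a short sketch. The following is a hypothetical, simplified interpretation, not the paper's actual algorithm: it assumes the "stratification" refers to renormalizing the positive and negative class weights separately after each AdaBoost round, so each class retains total weight 1/2; the stump learner and all names are illustrative.

```python
import numpy as np

def train_stump(X, y, w):
    """Exhaustive search for the best threshold stump under weights w."""
    best = None
    n, d = X.shape
    for j in range(d):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = np.where(X[:, j] <= thr, sign, -sign)
                err = np.sum(w[pred != y])
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    err, j, thr, sign = best
    return (lambda Xq: np.where(Xq[:, j] <= thr, sign, -sign)), err

def adaboost_stratified(X, y, rounds=10):
    """Discrete AdaBoost with a stratified renormalization step (assumed variant)."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(rounds):
        h, err = train_stump(X, y, w)
        err = max(err, 1e-10)                      # guard against log(0)
        alpha = 0.5 * np.log((1 - err) / err)      # standard AdaBoost weight
        ensemble.append((alpha, h))
        w *= np.exp(-alpha * y * h(X))             # usual exponential reweighting
        # Stratification (assumption): positives and negatives each
        # renormalized to total weight 1/2 after every round.
        for cls in (1, -1):
            mask = y == cls
            w[mask] *= 0.5 / w[mask].sum()
    def predict(Xq):
        return np.sign(sum(a * h(Xq) for a, h in ensemble))
    return predict
```

On a toy 1-D problem separable by a single threshold, the ensemble drives the training error to zero within a few rounds, and the per-class renormalization keeps both classes equally represented in the weight distribution throughout.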