The value of agreement, a new boosting algorithm

Authors:
Boaz Leskes; Leen Torenvliet
Affiliations:
University of Amsterdam, Department of Computer Science, Plantage Muidergracht 24, 1018 TV Amsterdam, Netherlands; University of Amsterdam, Department of Computer Science, Plantage Muidergracht 24, 1018 TV Amsterdam, Netherlands
Venue:
Journal of Computer and System Sciences
Year:
2008

Citing 14

Learnability and the Vapnik-Chervonenkis dimension
Journal of the ACM (JACM)

A decision-theoretic generalization of on-line learning and an application to boosting
Journal of Computer and System Sciences - Special issue: 26th annual ACM symposium on the theory of computing & STOC'94, May 23–25, 1994, and second annual European conference on computational learning theory (EuroCOLT'95), March 13–15, 1995

Combining labeled and unlabeled data with co-training
COLT '98 Proceedings of the eleventh annual conference on Computational learning theory

Boosting in the limit: maximizing the margin of learned ensembles
AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence

Learning to classify text from labeled and unlabeled documents
AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence

Analyzing the effectiveness and applicability of co-training
Proceedings of the ninth international conference on Information and knowledge management

Boosting the margin: A new explanation for the effectiveness of voting methods
ICML '97 Proceedings of the Fourteenth International Conference on Machine Learning

Improving Short-Text Classification using Unlabeled Data for Classification Problems
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning

Enhancing Supervised Learning with Unlabeled Data
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning

Exploiting unlabeled data in ensemble methods
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining

An introduction to boosting and leveraging
Advanced lectures on machine learning

Rademacher and Gaussian complexities: risk bounds and structural results
The Journal of Machine Learning Research

Unsupervised Improvement of Visual Detectors using Co-Training
ICCV '03 Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2

Co-trained support vector machines for large scale unstructured document classification using unlabeled data and syntactic information
Information Processing and Management: an International Journal

Cited 2

Robust multi-view boosting with priors
ECCV'10 Proceedings of the 11th European conference on computer vision conference on Computer vision: Part III

Disagreement-Based multi-system tracking
ACCV'12 Proceedings of the 11th international conference on Computer Vision - Volume 2


Abstract

In the past few years, unlabeled examples and their potential advantage have received a lot of attention. In this paper, a new boosting algorithm is presented in which unlabeled examples are used to enforce agreement between several different learning algorithms. The learning algorithms not only learn from the given training set but are also expected to agree on the unlabeled examples while doing so. Similar ideas have been proposed before (for example, the Co-Training algorithm by Mitchell and Blum), but without a proof or under strong assumptions. In our setting, it is only assumed that all learning algorithms are equally adequate for the tasks. A new generalization bound is presented in which the use of unlabeled examples results in a better ratio between training-set size and the resulting classifier's quality, and thus reduces the number of labeled examples necessary for achieving it. The extent of this improvement depends on the diversity of the learners: a more diverse group of learners will result in a larger improvement, whereas using two copies of a single algorithm gives no advantage at all. As a proof of concept, the algorithm, named Agreement Boost, is applied to two test problems. In both cases, using Agreement Boost results in a reduction of up to 40% in the number of labeled examples.
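
To make the idea of "learning from the training set while agreeing on the unlabeled examples" concrete, the following is a minimal sketch of one way such a combined objective could look. It is not the formula from the paper: the function name `agreement_objective`, the weight `eta`, the exponential loss on labeled examples, and the use of output variance as the disagreement measure are all assumptions chosen for illustration.

```python
import math

def agreement_objective(labeled, unlabeled, learners, eta=1.0):
    """Illustrative objective (an assumption, not the paper's definition):
    exponential loss on labeled data plus a disagreement penalty on unlabeled data.

    labeled   -- list of (x, y) pairs with y in {-1, +1}
    unlabeled -- list of x values without labels
    learners  -- list of functions mapping x to a real-valued score
    eta       -- hypothetical weight of the disagreement term
    """
    # Labeled term: exponential loss summed over every learner and example.
    labeled_loss = sum(math.exp(-y * f(x)) for f in learners for x, y in labeled)

    # Unlabeled term: variance of the learners' outputs on each example.
    # It is zero exactly when all learners give the same output.
    disagreement = 0.0
    for x in unlabeled:
        outputs = [f(x) for f in learners]
        mean = sum(outputs) / len(outputs)
        disagreement += sum((o - mean) ** 2 for o in outputs) / len(outputs)

    return labeled_loss + eta * disagreement
```

In this sketch, passing two copies of the same learner makes the unlabeled term vanish, so the unlabeled examples contribute nothing. This loosely mirrors the abstract's remark that two copies of a single algorithm give no advantage, whereas learners that differ on the unlabeled examples are penalized until they agree.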