Logistic Regression, AdaBoost and Bregman Distances

Authors:
Michael Collins;Robert E. Schapire;Yoram Singer
Affiliations:
AT&T Labs—Research, Shannon Laboratory, 180 Park Avenue, Florham Park, NJ 07932, USA. mcollins@research.att.com;AT&T Labs—Research, Shannon Laboratory, 180 Park Avenue, Florham Park, NJ 07932, USA. schapire@research.att.com;School of Computer Science & Engineering, Hebrew University, Jerusalem 91904, Israel. singer@cs.huji.ac.il
Venue:
Machine Learning
Year:
2002

Abstract

We give a unified account of boosting and logistic regression in which each learning problem is cast in terms of optimization of Bregman distances. The striking similarity of the two problems in this framework allows us to design and analyze algorithms for both simultaneously, and to easily adapt algorithms designed for one problem to the other. For both problems, we give new algorithms and explain their potential advantages over existing methods. These algorithms are iterative and can be divided into two types based on whether the parameters are updated sequentially (one at a time) or in parallel (all at once). We also describe a parameterized family of algorithms that includes both a sequential- and a parallel-update algorithm as special cases, thus showing how the sequential and parallel approaches can themselves be unified. For all of the algorithms, we give convergence proofs using a general formalization of the auxiliary-function proof technique. As one of our sequential-update algorithms is equivalent to AdaBoost, this provides the first general proof of convergence for AdaBoost. We show that all of our algorithms generalize easily to the multiclass case, and we contrast the new algorithms with the iterative scaling algorithm. We conclude with a few experimental results with synthetic data that highlight the behavior of the old and newly proposed algorithms in different settings.
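The sequential-update idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' published pseudocode. It assumes binary features, so each entry `M[i][j] = y_i * h_j(x_i)` is +1 or -1, and it uses the standard exponential loss (AdaBoost) and logistic loss. The names `example_weights`, `total_loss`, and `sequential_fit`, the feature-selection rule, and the smoothing constant `eps` are all hypothetical choices made for illustration. The only difference between the two settings in this sketch is how the per-example weights are computed from the current margins.

```python
import math

def example_weights(margins, loss):
    """Per-example weights q_i computed from the margins y_i * f(x_i)."""
    if loss == "exp":   # exponential loss: AdaBoost-style weights
        return [math.exp(-m) for m in margins]
    if loss == "log":   # logistic loss: weights stay in (0, 1)
        return [1.0 / (1.0 + math.exp(m)) for m in margins]
    raise ValueError("loss must be 'exp' or 'log'")

def total_loss(margins, loss):
    """Exponential loss or logistic loss summed over all examples."""
    if loss == "exp":
        return sum(math.exp(-m) for m in margins)
    return sum(math.log1p(math.exp(-m)) for m in margins)

def sequential_fit(M, loss, rounds=20, eps=1e-12):
    """Update one parameter per round (illustrative sketch only).

    M[i][j] = y_i * h_j(x_i), assumed to be +1 or -1.
    Returns the parameter vector and the loss recorded before each round.
    """
    n, d = len(M), len(M[0])
    lam = [0.0] * d
    history = []
    for _ in range(rounds):
        margins = [sum(M[i][j] * lam[j] for j in range(d)) for i in range(n)]
        history.append(total_loss(margins, loss))
        q = example_weights(margins, loss)
        best_j, best_gap, best_delta = 0, -1.0, 0.0
        for j in range(d):
            # Weight on examples where feature j agrees / disagrees with the label.
            w_plus = sum(q[i] for i in range(n) if M[i][j] > 0)
            w_minus = sum(q[i] for i in range(n) if M[i][j] < 0)
            gap = abs(w_plus - w_minus)  # assumed selection rule
            if gap > best_gap:
                # eps avoids division by zero when one side has no weight.
                best_delta = 0.5 * math.log((w_plus + eps) / (w_minus + eps))
                best_j, best_gap = j, gap
        lam[best_j] += best_delta
    return lam, history

if __name__ == "__main__":
    M = [[1, 1, -1], [1, -1, 1], [-1, 1, 1], [1, 1, 1]]
    for loss in ("exp", "log"):
        lam, history = sequential_fit(M, loss, rounds=10)
        print(loss, [round(v, 3) for v in lam], round(history[-1], 4))
```

A parallel-update variant would instead adjust every parameter in each round, typically with more conservative step sizes; that variant is not shown here.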