Mistake bounds and logarithmic linear-threshold learning algorithms
COLT '90 Proceedings of the third annual workshop on Computational learning theory
Redundant noisy attributes, attribute errors, and linear-threshold learning using winnow
COLT '91 Proceedings of the fourth annual workshop on Computational learning theory
On Bayes methods for on-line Boolean prediction
COLT '96 Proceedings of the ninth annual conference on Computational learning theory
Exponentiated gradient versus gradient descent for linear predictors
Information and Computation
Journal of the ACM (JACM)
General convergence results for linear discriminant updates
COLT '97 Proceedings of the tenth annual conference on Computational learning theory
Artificial Intelligence - Special issue on relevance
Relative loss bounds for multidimensional regression problems
NIPS '97 Proceedings of the 1997 conference on Advances in neural information processing systems 10
The robustness of the p-norm algorithms
COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
A Winnow-Based Approach to Context-Sensitive Spelling Correction
Machine Learning - Special issue on natural language learning
Linear hinge loss and average margin
Proceedings of the 1998 conference on Advances in neural information processing systems II
Parallel Optimization: Theory, Algorithms and Applications
Relational Learning for NLP using Linear Threshold Elements
IJCAI '99 Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence
Relative loss bounds for on-line density estimation with the exponential family of distributions
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Relative loss bounds for single neurons
IEEE Transactions on Neural Networks
Direct and indirect algorithms for on-line learning of disjunctions
Theoretical Computer Science
Potential-Based Algorithms in On-Line Prediction and Game Theory
Machine Learning
Uncertainty-Based Noise Reduction and Term Selection in Text Categorization
Proceedings of the 24th BCS-IRSG European Colloquium on IR Research: Advances in Information Retrieval
Evidence that Incremental Delta-Bar-Delta Is an Attribute-Efficient Linear Learner
ECML '02 Proceedings of the 13th European Conference on Machine Learning
Large Margin Classification for Moving Targets
ALT '02 Proceedings of the 13th International Conference on Algorithmic Learning Theory
A Second-Order Perceptron Algorithm
COLT '02 Proceedings of the 15th Annual Conference on Computational Learning Theory
Uncertainty and term selection in text categorization
International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
Online learning of linear classifiers
Advanced lectures on machine learning
Tracking the best linear predictor
The Journal of Machine Learning Research
Covering number bounds of certain regularized linear function classes
The Journal of Machine Learning Research
A new approximate maximal margin classification algorithm
The Journal of Machine Learning Research
Ultraconservative online algorithms for multiclass problems
The Journal of Machine Learning Research
The Robustness of the p-Norm Algorithms
Machine Learning
Tracking linear-threshold concepts with Winnow
The Journal of Machine Learning Research
Patent document categorization based on semantic structural information
Information Processing and Management: an International Journal
Worst-Case Analysis of Selective Sampling for Linear Classification
The Journal of Machine Learning Research
A primal-dual perspective of online learning algorithms
Machine Learning
Learning to assign degrees of belief in relational domains
Machine Learning
Information Geometry and Information Theory in Machine Learning
Neural Information Processing
Adaptive Learning Rate for Online Linear Discriminant Classifiers
SSPR & SPR '08 Proceedings of the 2008 Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition
Stochastic methods for l1 regularized loss minimization
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Online learning by ellipsoid method
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
ECIR'03 Proceedings of the 25th European conference on IR research
On the importance of parameter tuning in text categorization
PSI'06 Proceedings of the 6th international Andrei Ershov memorial conference on Perspectives of systems informatics
Linear Algorithms for Online Multitask Classification
The Journal of Machine Learning Research
Stochastic Methods for l1-regularized Loss Minimization
The Journal of Machine Learning Research
Online learning meets optimization in the dual
COLT'06 Proceedings of the 19th annual conference on Learning Theory
Tracking the best hyperplane with a simple budget perceptron
COLT'06 Proceedings of the 19th annual conference on Learning Theory
Online Learning and Online Convex Optimization
Foundations and Trends® in Machine Learning
Regularization techniques for learning with matrices
The Journal of Machine Learning Research
The problem of learning linear-discriminant concepts can be solved by various mistake-driven update procedures, including the Winnow family of algorithms and the well-known Perceptron algorithm. In this paper we define the general class of "quasi-additive" algorithms, which includes Perceptron and Winnow as special cases. We give a single proof of convergence that covers a broad subset of algorithms in this class, including Perceptron and Winnow as well as many new algorithms. Our proof hinges on a generic measure-of-progress construction that gives insight into when and how such algorithms converge.

The measure-of-progress construction also lets us obtain good mistake bounds for individual algorithms. We apply our unified analysis to new algorithms as well as to existing ones. When applied to known algorithms, our method "automatically" produces close variants of the existing proofs (recovering similar bounds), thus showing that, in a certain sense, these seemingly diverse results are fundamentally isomorphic. We also demonstrate that the unifying principles are more broadly applicable, and analyze a new class of algorithms that smoothly interpolate between the additive-update behavior of Perceptron and the multiplicative-update behavior of Winnow.
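The contrast the abstract draws between the two update families can be made concrete with a small sketch. This is my own illustration, not the paper's formulation: the function names, the toy OR dataset (with a fixed -1 "threshold" feature so Winnow's positive weights can express the separator), and the learning rates are all assumed for the example. Perceptron adds eta*y*x to the weights on each mistake; Winnow instead multiplies each weight by exp(eta*y*x_i).

```python
# Illustrative sketch (not the paper's exact formulation) of two
# mistake-driven linear-threshold learners: additive Perceptron updates
# versus multiplicative Winnow-style updates.

import math

def perceptron_update(w, x, y, eta=1.0):
    # Additive update: w_i <- w_i + eta * y * x_i
    return [wi + eta * y * xi for wi, xi in zip(w, x)]

def winnow_update(w, x, y, eta=0.5):
    # Multiplicative update: w_i <- w_i * exp(eta * y * x_i);
    # weights stay strictly positive.
    return [wi * math.exp(eta * y * xi) for wi, xi in zip(w, x)]

def train(update, data, w, max_epochs=100):
    mistakes = 0
    for _ in range(max_epochs):
        clean_pass = True
        for x, y in data:
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:          # mistake-driven: update only on errors
                w = update(w, x, y)
                mistakes += 1
                clean_pass = False
        if clean_pass:                  # a full pass with no mistakes: done
            break
    return w, mistakes

# Boolean OR (labels in {-1, +1}), with a constant -1 feature acting as a
# learnable threshold so the positive-weight Winnow learner can separate it.
data = [([0, 0, -1], -1), ([1, 0, -1], 1), ([0, 1, -1], 1), ([1, 1, -1], 1)]

w_p, m_p = train(perceptron_update, data, [0.0, 0.0, 0.0])
w_w, m_w = train(winnow_update, data, [1.0, 1.0, 1.0])

# Both learners converge to weights that classify every example correctly.
for w in (w_p, w_w):
    assert all(y * sum(wi * xi for wi, xi in zip(w, x)) > 0 for x, y in data)
```

The quasi-additive algorithms analyzed in the paper generalize this picture: both updates are instances of maintaining a parameter vector that is mapped through a link function (the identity for Perceptron, an exponential for Winnow) before scoring, and the interpolating family mentioned at the end of the abstract varies that link smoothly between the two.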