Mistake bounds and logarithmic linear-threshold learning algorithms
COLT '90: Proceedings of the Third Annual Workshop on Computational Learning Theory
From on-line to batch learning. COLT '89: Proceedings of the Second Annual Workshop on Computational Learning Theory.
Redundant noisy attributes, attribute errors, and linear-threshold learning using Winnow. COLT '91: Proceedings of the Fourth Annual Workshop on Computational Learning Theory.
The weighted majority algorithm. Information and Computation.
On-line prediction and conversion strategies. Machine Learning.
Exponentiated gradient versus gradient descent for linear predictors. Information and Computation.
Predicting nearly as well as the best pruning of a decision tree. Machine Learning (special issue on COLT '95).
The binary exponentiated gradient algorithm for learning linear functions. COLT '97: Proceedings of the Tenth Annual Conference on Computational Learning Theory.
Competitive on-line linear regression. NIPS '97: Advances in Neural Information Processing Systems 10.
The robustness of the p-norm algorithms. COLT '99: Proceedings of the Twelfth Annual Conference on Computational Learning Theory.
Regret bounds for prediction problems. COLT '99: Proceedings of the Twelfth Annual Conference on Computational Learning Theory.
Large margin classification using the Perceptron algorithm. Machine Learning (special issue on COLT '98).
Linear hinge loss and average margin. NIPS '98: Advances in Neural Information Processing Systems 11.
Relative loss bounds for multidimensional regression problems. Machine Learning.
General convergence results for linear discriminant updates. Machine Learning.
Predicting nearly as well as the best pruning of a planar decision graph. Theoretical Computer Science.
Adaptive and self-confident on-line learning algorithms. COLT '00: Proceedings of the Thirteenth Annual Conference on Computational Learning Theory.
Relative expected instantaneous loss bounds. COLT '00: Proceedings of the Thirteenth Annual Conference on Computational Learning Theory.
A decision-theoretic extension of stochastic complexity and its applications to learning. IEEE Transactions on Information Theory.
Worst-case quadratic loss bounds for prediction using linear functions and gradient descent. IEEE Transactions on Neural Networks.
Relative loss bounds for single neurons. IEEE Transactions on Neural Networks.
Online multiclass learning by interclass hypothesis sharing. ICML '06: Proceedings of the 23rd International Conference on Machine Learning.
Online passive-aggressive algorithms. Journal of Machine Learning Research.
Worst-case analysis of selective sampling for linear classification. Journal of Machine Learning Research.
Applications of regularized least squares to pattern classification. Theoretical Computer Science.
Tracking the best hyperplane with a simple budget Perceptron. Machine Learning.
A primal-dual perspective of online learning algorithms. Machine Learning.
Learning to assign degrees of belief in relational domains. Machine Learning.
Mixed Bregman clustering with approximation guarantees. ECML PKDD '08: Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, Part II.
Exploiting cluster-structure to predict the labeling of a graph. ALT '08: Proceedings of the 19th International Conference on Algorithmic Learning Theory.
Stochastic methods for ℓ1-regularized loss minimization. ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning.
Adaptive fuzzy filtering in a deterministic setting. IEEE Transactions on Fuzzy Systems.
Individual sequence prediction using memory-efficient context trees. IEEE Transactions on Information Theory.
Bounded kernel-based online learning. Journal of Machine Learning Research.
Dual averaging methods for regularized stochastic learning and online optimization. Journal of Machine Learning Research.
Linear algorithms for online multitask classification. Journal of Machine Learning Research.
Stochastic methods for ℓ1-regularized loss minimization. Journal of Machine Learning Research.
Ensembles and multiple classifiers: a game-theoretic view. MCS '11: Proceedings of the 10th International Conference on Multiple Classifier Systems.
Adaptive and optimal online linear regression on ℓ1-balls. ALT '11: Proceedings of the 22nd International Conference on Algorithmic Learning Theory.
Online learning meets optimization in the dual. COLT '06: Proceedings of the 19th Annual Conference on Learning Theory.
Tracking the best hyperplane with a simple budget Perceptron. COLT '06: Proceedings of the 19th Annual Conference on Learning Theory.
Online learning and online convex optimization. Foundations and Trends in Machine Learning.
Regularization techniques for learning with matrices. Journal of Machine Learning Research.
Adaptive regularization of weight vectors. Machine Learning.
Adaptive and optimal online linear regression on ℓ1-balls. Theoretical Computer Science.
We consider two on-line learning frameworks: binary classification through linear threshold functions and linear regression. We study a family of on-line algorithms, called p-norm algorithms, introduced by Grove, Littlestone and Schuurmans in the context of deterministic binary classification. We show how to adapt these algorithms to the regression setting and prove worst-case bounds on the square loss using a technique from Kivinen and Warmuth. As Grove et al. pointed out, these algorithms can be made to approach a version of the classification algorithm Winnow as p goes to infinity; similarly, they can be made to approach the corresponding regression algorithm EG in the limit. Winnow and EG are notable for having loss bounds that grow only logarithmically in the dimension of the instance space. Here we describe another way of using the p-norm algorithms that achieves this logarithmic behavior. With the usage we propose, retuning the algorithm's parameters as the learning task changes is less critical than it is with Winnow and EG. Since the correct parameter settings depend on characteristics of the learning task that the learner does not typically know a priori, this gives the p-norm algorithms a desirable robustness. Our elaborations yield various new loss bounds in these on-line settings. Some of these bounds improve on or generalize known results; others are incomparable.
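To make the family concrete, here is a minimal Python sketch of the two settings the abstract discusses: a mistake-driven p-norm Perceptron for binary classification and a gradient-style update for square-loss regression. This follows the standard textbook presentation of the p-norm algorithms, in which the primal weights are obtained from a dual vector theta through the gradient of (1/2)||theta||_q^2 with 1/p + 1/q = 1; the function names, the learning rate eta, and the exact normalization are illustrative assumptions, not the authors' precise formulation.

```python
import numpy as np

def link(theta, p):
    # Map the dual vector theta to primal weights w = grad(0.5 * ||theta||_q^2),
    # where q is the exponent dual to p (1/p + 1/q = 1).
    q = p / (p - 1.0)
    norm_q = np.linalg.norm(theta, ord=q)
    if norm_q == 0.0:               # before any update, predict with w = 0
        return np.zeros_like(theta)
    return np.sign(theta) * np.abs(theta) ** (q - 1.0) / norm_q ** (q - 2.0)

def p_norm_perceptron(examples, n, p):
    # Mistake-driven classification: p = 2 recovers the classic Perceptron,
    # while p ~ 2 ln n yields Winnow-like mistake bounds, logarithmic in n.
    theta = np.zeros(n)
    mistakes = 0
    for x, y in examples:           # y in {-1, +1}
        w = link(theta, p)
        y_hat = 1.0 if w @ x >= 0.0 else -1.0
        if y_hat != y:
            theta += y * x          # update only on mistakes
            mistakes += 1
    return mistakes

def p_norm_regression(examples, n, p, eta):
    # Square-loss regression with Kivinen-Warmuth-style gradient updates:
    # descend in the dual vector theta along the gradient of (y_hat - y)^2.
    theta = np.zeros(n)
    total_loss = 0.0
    for x, y in examples:
        w = link(theta, p)
        y_hat = w @ x
        total_loss += (y_hat - y) ** 2
        theta -= eta * 2.0 * (y_hat - y) * x
    return total_loss
```

With p = 2 the link function is the identity and both routines reduce to the ordinary Perceptron and plain gradient descent; as p grows, the link function concentrates weight multiplicatively, which is the sense in which the family interpolates toward Winnow and EG in the limit.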