Mistake bounds and logarithmic linear-threshold learning algorithms
COLT '90: Proceedings of the Third Annual Workshop on Computational Learning Theory
From on-line to batch learning. COLT '89: Proceedings of the Second Annual Workshop on Computational Learning Theory.
Redundant noisy attributes, attribute errors, and linear-threshold learning using Winnow. COLT '91: Proceedings of the Fourth Annual Workshop on Computational Learning Theory.
The weighted majority algorithm. Information and Computation.
On-line prediction and conversion strategies. Machine Learning.
Exponentiated gradient versus gradient descent for linear predictors. Information and Computation.
Predicting nearly as well as the best pruning of a decision tree. Machine Learning (special issue on COLT '95).
The binary exponentiated gradient algorithm for learning linear functions. COLT '97: Proceedings of the Tenth Annual Conference on Computational Learning Theory.
Competitive on-line linear regression. NIPS '97: Advances in Neural Information Processing Systems 10.
The robustness of the p-norm algorithms. COLT '99: Proceedings of the Twelfth Annual Conference on Computational Learning Theory.
Regret bounds for prediction problems. COLT '99: Proceedings of the Twelfth Annual Conference on Computational Learning Theory.
Large margin classification using the Perceptron algorithm. Machine Learning (special issue on COLT '98).
Linear hinge loss and average margin. NIPS '98: Advances in Neural Information Processing Systems 11.
Relative loss bounds for multidimensional regression problems. Machine Learning.
General convergence results for linear discriminant updates. Machine Learning.
Predicting nearly as well as the best pruning of a planar decision graph. Theoretical Computer Science.
Adaptive and self-confident on-line learning algorithms. COLT '00: Proceedings of the Thirteenth Annual Conference on Computational Learning Theory.
Relative expected instantaneous loss bounds. COLT '00: Proceedings of the Thirteenth Annual Conference on Computational Learning Theory.
A decision-theoretic extension of stochastic complexity and its applications to learning. IEEE Transactions on Information Theory.
Worst-case quadratic loss bounds for prediction using linear functions and gradient descent. IEEE Transactions on Neural Networks.
Relative loss bounds for single neurons. IEEE Transactions on Neural Networks.
Online multiclass learning by interclass hypothesis sharing. ICML '06: Proceedings of the 23rd International Conference on Machine Learning.
Online passive-aggressive algorithms. Journal of Machine Learning Research.
Worst-case analysis of selective sampling for linear classification. Journal of Machine Learning Research.
Applications of regularized least squares to pattern classification. Theoretical Computer Science.
Tracking the best hyperplane with a simple budget Perceptron. Machine Learning.
A primal-dual perspective of online learning algorithms. Machine Learning.
Learning to assign degrees of belief in relational domains. Machine Learning.
Mixed Bregman clustering with approximation guarantees. ECML PKDD '08: Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, Part II.
Exploiting cluster-structure to predict the labeling of a graph. ALT '08: Proceedings of the 19th International Conference on Algorithmic Learning Theory.
Stochastic methods for ℓ1-regularized loss minimization. ICML '09: Proceedings of the 26th Annual International Conference on Machine Learning.
Adaptive fuzzy filtering in a deterministic setting. IEEE Transactions on Fuzzy Systems.
Individual sequence prediction using memory-efficient context trees. IEEE Transactions on Information Theory.
Bounded kernel-based online learning. Journal of Machine Learning Research.
Dual averaging methods for regularized stochastic learning and online optimization. Journal of Machine Learning Research.
Linear algorithms for online multitask classification. Journal of Machine Learning Research.
Stochastic methods for ℓ1-regularized loss minimization. Journal of Machine Learning Research.
Ensembles and multiple classifiers: a game-theoretic view. MCS '11: Proceedings of the 10th International Conference on Multiple Classifier Systems.
Adaptive and optimal online linear regression on ℓ1-balls. ALT '11: Proceedings of the 22nd International Conference on Algorithmic Learning Theory.
Online learning meets optimization in the dual. COLT '06: Proceedings of the 19th Annual Conference on Learning Theory.
Tracking the best hyperplane with a simple budget Perceptron. COLT '06: Proceedings of the 19th Annual Conference on Learning Theory.
Online learning and online convex optimization. Foundations and Trends in Machine Learning.
Regularization techniques for learning with matrices. Journal of Machine Learning Research.
Adaptive regularization of weight vectors. Machine Learning.
Adaptive and optimal online linear regression on ℓ1-balls. Theoretical Computer Science.
We consider two on-line learning frameworks: binary classification through linear threshold functions and linear regression. We study a family of on-line algorithms, called p-norm algorithms, introduced by Grove, Littlestone and Schuurmans in the context of deterministic binary classification. We show how to adapt these algorithms to the regression setting and prove worst-case bounds on the square loss using a technique from Kivinen and Warmuth. As Grove et al. pointed out, these algorithms can be made to approach a version of the classification algorithm Winnow as p goes to infinity; similarly, they can be made to approach the corresponding regression algorithm EG in the limit. Winnow and EG are notable for having loss bounds that grow only logarithmically in the dimension of the instance space. Here we describe another way of using the p-norm algorithms that achieves this logarithmic behavior. With the usage we propose, retuning the algorithm's parameters as the learning task changes is less critical than it is with Winnow and EG. Since the correct parameter settings depend on characteristics of the learning task that the learner does not typically know a priori, this gives the p-norm algorithms a desirable robustness. Our elaborations yield various new loss bounds in these on-line settings. Some of these bounds improve on or generalize known results; others are incomparable.
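To make the family concrete, here is a minimal Python sketch of the two settings the abstract discusses: a mistake-driven p-norm Perceptron for binary classification and a gradient-style update for square-loss regression. This follows the standard textbook presentation of the p-norm algorithms, in which the primal weights are obtained from a dual vector theta through the gradient of (1/2)||theta||_q^2 with 1/p + 1/q = 1; the function names, the learning rate eta, and the exact normalization are illustrative assumptions, not the authors' precise formulation.

```python
import numpy as np

def link(theta, p):
    # Map the dual vector theta to primal weights w = grad(0.5 * ||theta||_q^2),
    # where q is the exponent dual to p (1/p + 1/q = 1).
    q = p / (p - 1.0)
    norm_q = np.linalg.norm(theta, ord=q)
    if norm_q == 0.0:               # before any update, predict with w = 0
        return np.zeros_like(theta)
    return np.sign(theta) * np.abs(theta) ** (q - 1.0) / norm_q ** (q - 2.0)

def p_norm_perceptron(examples, n, p):
    # Mistake-driven classification: p = 2 recovers the classic Perceptron,
    # while p ~ 2 ln n yields Winnow-like mistake bounds, logarithmic in n.
    theta = np.zeros(n)
    mistakes = 0
    for x, y in examples:           # y in {-1, +1}
        w = link(theta, p)
        y_hat = 1.0 if w @ x >= 0.0 else -1.0
        if y_hat != y:
            theta += y * x          # update only on mistakes
            mistakes += 1
    return mistakes

def p_norm_regression(examples, n, p, eta):
    # Square-loss regression with Kivinen-Warmuth-style gradient updates:
    # descend in the dual vector theta along the gradient of (y_hat - y)^2.
    theta = np.zeros(n)
    total_loss = 0.0
    for x, y in examples:
        w = link(theta, p)
        y_hat = w @ x
        total_loss += (y_hat - y) ** 2
        theta -= eta * 2.0 * (y_hat - y) * x
    return total_loss
```

With p = 2 the link function is the identity and both routines reduce to the ordinary Perceptron and plain gradient descent; as p grows, the link function concentrates weight multiplicatively, which is the sense in which the family interpolates toward Winnow and EG in the limit.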