Adaptive signal processing algorithms: stability and performance
Exponentiated gradient versus gradient descent for linear predictors
Information and Computation
General convergence results for linear discriminant updates
COLT '97: Proceedings of the Tenth Annual Conference on Computational Learning Theory
Natural gradient works efficiently in learning
Neural Computation
Relative loss bounds for multidimensional regression problems
NIPS '97: Proceedings of the 1997 Conference on Advances in Neural Information Processing Systems 10
The robustness of the p-norm algorithms
COLT '99: Proceedings of the Twelfth Annual Conference on Computational Learning Theory
Regret bounds for prediction problems
COLT '99: Proceedings of the Twelfth Annual Conference on Computational Learning Theory
Relative loss bounds for multidimensional regression problems
Machine Learning
Fundamentals of Artificial Neural Networks
Optimal and Adaptive Signal Processing
Feedforward Neural Network Methodology
Approximate solutions to Markov decision processes
Pattern Classification (2nd Edition)
Convergence of exponentiated gradient algorithms
IEEE Transactions on Signal Processing
Krylov-proportionate adaptive filtering techniques not limited to sparse systems
IEEE Transactions on Signal Processing
Expert mixture methods for adaptive channel equalization
ICANN/ICONIP '03: Proceedings of the 2003 Joint International Conference on Artificial Neural Networks and Neural Information Processing
A family of gradient descent algorithms for learning linear functions in an online setting is considered. The family includes the classical LMS algorithm as well as new variants such as the Exponentiated Gradient (EG) algorithm of Kivinen and Warmuth. The algorithms are based on prior distributions defined on the weight space: techniques from differential geometry are used to develop each algorithm as a gradient descent iteration with respect to the natural gradient in the Riemannian structure induced by the prior distribution. The proposed framework subsumes the notion of "link functions".
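As a concrete illustration, and not code from the paper itself, the following NumPy sketch contrasts two members of this family on online linear regression with squared loss: the LMS update, an additive gradient step, and the normalized EG update of Kivinen and Warmuth, which pushes the same gradient through an exponential link function and can be read as a gradient step under a different geometry on the weight space. The function names, learning rates, and the sparse synthetic target are illustrative assumptions, not from the source.

```python
# Minimal sketch contrasting LMS (additive gradient descent) with the
# normalized EG update for online linear prediction with squared loss.
# Learning rates and the synthetic target below are illustrative choices.
import numpy as np

def lms_step(w, x, y, eta=0.01):
    """LMS: additive step w - eta * grad of 0.5 * (w.x - y)^2."""
    err = w @ x - y
    return w - eta * err * x

def eg_step(w, x, y, eta=0.1):
    """Normalized EG: multiplicative update on positive weights summing to 1."""
    err = w @ x - y
    v = w * np.exp(-eta * err * x)   # exponentiated gradient step (link function)
    return v / v.sum()               # re-normalize back onto the simplex

rng = np.random.default_rng(0)
d = 5
target = np.zeros(d)
target[0] = 1.0                      # sparse target on the simplex
w_lms = np.zeros(d)
w_eg = np.full(d, 1.0 / d)           # uniform prior over the weight simplex

for _ in range(500):
    x = rng.normal(size=d)
    y = target @ x
    w_lms = lms_step(w_lms, x, y)
    w_eg = eg_step(w_eg, x, y)

print("LMS:", np.round(w_lms, 3))
print("EG :", np.round(w_eg, 3))
```

On a sparse target like this one, the EG iterate typically concentrates on the relevant coordinate faster than LMS, the behaviour that the relative loss bounds cited above formalize.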