Convergence of exponentiated gradient algorithms

  • Authors:
  • S.I. Hill; R.C. Williamson

  • Affiliations:
  • Research School of Information Sciences and Engineering, Australian National University, Canberra, ACT

  • Venue:
  • IEEE Transactions on Signal Processing
  • Year:
  • 2001

Abstract

This paper studies three related algorithms: the traditional gradient descent (GD) algorithm, the exponentiated gradient algorithm with positive and negative weights (EG± algorithm), and the exponentiated gradient algorithm with unnormalized positive and negative weights (EGU± algorithm). These algorithms have previously been analyzed in the computational learning theory community using the “mistake-bound framework.” We perform a traditional signal processing analysis in terms of the mean-squared error (MSE). A relationship between the learning rate and the MSE of predictions is found for this family of algorithms. This is used to compare the performance of the algorithms by choosing learning rates such that they converge to the same steady-state MSE. We demonstrate that if the target weight vector is sparse, the EG± algorithm typically converges more quickly than the GD and EGU± algorithms, which perform very similarly to each other. A side effect of our analysis is a reparametrization of the algorithms that provides insights into their behavior. The general form of the results we obtain is consistent with those obtained in the mistake-bound framework. The application of the algorithms to acoustic echo cancellation is then studied, and it is shown that in some circumstances the EG± algorithm converges faster than the other two algorithms.
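
For concreteness, the sketch below gives one plausible form of a single update step for each of the three algorithms, for a linear predictor ŷ = wᵀx trained on squared loss. It follows the standard Kivinen–Warmuth-style presentation of these updates rather than the paper itself; the learning rate eta, the scale parameter U, and the initializations are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def gd_step(w, x, y, eta):
    """Gradient descent (GD/LMS): additive update of the weight vector."""
    err = w @ x - y                       # prediction error for y_hat = w.x
    return w - eta * err * x

def egu_pm_step(wp, wm, x, y, eta):
    """EGU+/-: multiplicative (exponentiated-gradient) update of separate
    positive and negative weight vectors, left unnormalized.
    The effective weight vector is wp - wm."""
    err = (wp - wm) @ x - y
    return wp * np.exp(-eta * err * x), wm * np.exp(eta * err * x)

def eg_pm_step(wp, wm, x, y, eta, U=2.0):
    """EG+/-: the same multiplicative update, followed by rescaling so the
    total weight mass sum(wp) + sum(wm) stays equal to the scale U."""
    err = (wp - wm) @ x - y
    rp = wp * np.exp(-eta * err * x)
    rm = wm * np.exp(eta * err * x)
    Z = (rp.sum() + rm.sum()) / U         # normalizer preserving total mass U
    return rp / Z, rm / Z

# Example: track a sparse 8-tap target with GD and EG+/-.
# The targets, step sizes, and iteration count are arbitrary choices.
rng = np.random.default_rng(0)
target = np.zeros(8)
target[2] = 1.0                           # sparse target weight vector
w = np.zeros(8)
wp = np.full(8, 2.0 / 16)                 # EG+/- mass split evenly, U = 2
wm = np.full(8, 2.0 / 16)
for _ in range(1000):
    x = rng.standard_normal(8)
    y = target @ x
    w = gd_step(w, x, y, eta=0.01)
    wp, wm = eg_pm_step(wp, wm, x, y, eta=0.05, U=2.0)
```

The sketch makes the qualitative difference visible: GD adjusts each weight additively, while the exponentiated-gradient updates rescale weights multiplicatively, so components of a sparse target that should be zero are driven toward zero geometrically, which is consistent with the faster convergence the abstract reports for sparse targets.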