EXPONENTIATED GRADIENT VERSUS GRADIENT DESCENT FOR LINEAR PREDICTORS

  • Authors:
  • Jyrki Kivinen; Manfred Warmuth

  • Affiliations:
  • -;-

  • Venue:
  • -
  • Year:
  • 1994

Abstract

We consider two algorithms for on-line prediction based on a linear model. The algorithms are the well-known Gradient Descent (GD) algorithm and a new algorithm, which we call EG±. Both maintain a weight vector using simple updates. For the GD algorithm, the update subtracts the gradient of the squared error incurred on a prediction. The EG± algorithm uses the components of the gradient in the exponents of factors by which the weight vector is updated multiplicatively. We present worst-case loss bounds for EG± and compare them to previously known bounds for the GD algorithm. The bounds suggest that the losses of the two algorithms are in general incomparable, but that EG± has a much smaller loss when only a few components of the input are relevant for the predictions. We have performed experiments showing that our worst-case upper bounds are already quite tight on simple artificial data.
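To make the two update rules concrete, here is a minimal sketch of one on-line trial for each algorithm under squared loss. This is an illustration, not the paper's exact formulation: the learning rate eta, the total-weight parameter U, and the function names are hypothetical, and the EG± variant shown is the common two-weight-vector form (w⁺, w⁻) with renormalization, which matches the abstract's description of gradient components appearing in the exponents of multiplicative factors.

```python
import numpy as np

def gd_update(w, x, y, eta=0.1):
    """One GD trial: predict, then subtract the gradient of the
    squared error (y_hat - y)**2 with respect to w.
    (Sketch; the constant factor 2 is absorbed into eta.)"""
    y_hat = w @ x
    return w - eta * (y_hat - y) * x

def eg_pm_update(w_pos, w_neg, x, y, eta=0.1, U=1.0):
    """One EG± trial: each gradient component appears in the exponent
    of a multiplicative factor; the two positive weight vectors are
    renormalized so their total weight stays at U (assumed parameter)."""
    y_hat = (w_pos - w_neg) @ x
    r_pos = w_pos * np.exp(-eta * (y_hat - y) * x)
    r_neg = w_neg * np.exp(+eta * (y_hat - y) * x)
    Z = (r_pos.sum() + r_neg.sum()) / U  # normalization constant
    return r_pos / Z, r_neg / Z

# Hypothetical single trial on one example
x, y = np.array([1.0, -0.5, 2.0]), 0.7
w = gd_update(np.zeros(3), x, y)
w_pos = w_neg = np.full(3, 1.0 / 6)  # uniform start, total weight U = 1
w_pos, w_neg = eg_pm_update(w_pos, w_neg, x, y)
```

Note the structural difference the abstract points at: GD moves additively along the negative gradient, while EG± multiplies each weight by an exponential of the corresponding gradient component and renormalizes, keeping the weights positive. It is this multiplicative form that, per the bounds, gives EG± a much smaller loss when only a few input components are relevant.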