In the setting of relative loss bounds, the performance of an on-line learning algorithm is compared to that of a class of off-line predictors, called experts. In this paper we reconsider a result by Vovk, namely an upper bound on the on-line relative loss for linear regression with square loss; here the experts are linear functions. We give a shorter and simpler proof of Vovk's result and a new motivation for the choice of predictions made by Vovk's learning algorithm. This is done by computing the, in a certain sense, best prediction for the last trial of a sequence of trials when the outcome variable is known to be bounded. We then try to generalize these ideas to generalized linear regression, where the experts are neurons, and give a formula for the "best" last-trial prediction in this case as well. This prediction turns out to be essentially an integral over the "best" expert applied to the last instance. Predictions that are "optimal" in this sense may also be good predictions for long sequences of trials.
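
As a rough illustration of the kind of prediction rule discussed above, the following Python sketch implements a Vovk-Azoury-Warmuth-style linear predictor under square loss, with its output clipped to an assumed outcome range. The class name, the regularization constant a, the bound Y, and the clipping step are illustrative assumptions; this is a minimal sketch, not the paper's exact algorithm or proof construction.

# A minimal sketch, in the spirit of Vovk-Azoury-Warmuth-style forecasters,
# of an on-line linear-regression predictor under square loss.  The
# regularization constant `a`, the outcome bound `Y`, and the clipping
# step are illustrative assumptions, not the paper's exact construction.
import numpy as np

class LastStepStylePredictor:
    def __init__(self, dim, a=1.0, Y=1.0):
        self.A = a * np.eye(dim)   # a*I plus the sum of outer products x_s x_s^T
        self.b = np.zeros(dim)     # sum of y_s * x_s over past trials
        self.Y = Y                 # assumed bound on the outcomes: |y_t| <= Y

    def predict(self, x):
        # Fold the current instance into A before predicting, then clip the
        # linear prediction to the known outcome range [-Y, Y].
        A_t = self.A + np.outer(x, x)
        y_hat = x @ np.linalg.solve(A_t, self.b)
        return float(np.clip(y_hat, -self.Y, self.Y))

    def update(self, x, y):
        # After the outcome is revealed, accumulate the sufficient statistics.
        self.A += np.outer(x, x)
        self.b += y * x

# Usage on one trial: predict, observe the outcome, then update.
pred = LastStepStylePredictor(dim=3)
x_t = np.array([0.2, -0.5, 1.0])
y_hat = pred.predict(x_t)
pred.update(x_t, 0.3)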