Worst-Case Bounds for the Logarithmic Loss of Predictors

  • Authors:
  • Nicolò Cesa-Bianchi; Gábor Lugosi

  • Affiliations:
  • Department of Information Technologies, University of Milan, Via Bramante 65, 26013 Crema, Italy. cesabian@dsi.unimi.it; Department of Economics, Pompeu Fabra University, Ramon Trias Fargas 25-27, 08005 Barcelona, Spain. lugosi@upf.es

  • Venue:
  • Machine Learning
  • Year:
  • 2001

Abstract

We investigate on-line prediction of individual sequences. Given a class of predictors, the goal is to predict as well as the best predictor in the class, where the loss is measured by the self-information (logarithmic) loss function. The excess loss (regret) is closely related to the redundancy of the associated lossless universal code. Using Shtarkov's theorem and tools from empirical process theory, we prove a general upper bound on the best possible (minimax) regret. The bound depends on certain metric properties of the class of predictors. We apply the bound to both parametric and nonparametric classes of predictors. Finally, we point out a suboptimal behavior of the popular Bayesian weighted average algorithm.
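
To make the quantities in the abstract concrete, the following is a minimal sketch, not the paper's construction. It assumes the simplest parametric class, constant Bernoulli predictors on binary sequences of length n, and uses Shtarkov's theorem in its standard form: the minimax regret under logarithmic loss equals the logarithm of the sum, over all sequences, of the maximum probability the class assigns to each sequence. For comparison it also computes the worst-case regret of a Bayesian mixture under a uniform prior on the Bernoulli parameter; that prior, the function names, and the choice of class are illustrative assumptions, and the suboptimality example in the paper itself may differ.

```python
import math


def max_likelihood(k: int, n: int) -> float:
    """Largest probability any Bernoulli(theta) predictor assigns to a
    fixed binary sequence of length n with k ones (attained at theta = k/n)."""
    if k == 0 or k == n:
        return 1.0
    p = k / n
    return p ** k * (1 - p) ** (n - k)


def minimax_regret(n: int) -> float:
    """Minimax regret (in nats) via Shtarkov's theorem: log of the sum over
    all 2^n sequences of the maximum likelihood, grouped by number of ones."""
    shtarkov_sum = sum(math.comb(n, k) * max_likelihood(k, n) for k in range(n + 1))
    return math.log(shtarkov_sum)


def uniform_mixture_worst_regret(n: int) -> float:
    """Worst-case regret (in nats) of the Bayesian mixture with a uniform
    prior on theta (an assumed prior, chosen only for illustration).
    The mixture gives a sequence with k ones probability 1 / ((n + 1) * C(n, k))."""
    return max(
        math.log((n + 1) * math.comb(n, k) * max_likelihood(k, n))
        for k in range(n + 1)
    )


if __name__ == "__main__":
    # Moderate n only: this direct formula is not numerically safe for very large n.
    for n in (1, 2, 10, 100):
        print(n, minimax_regret(n), uniform_mixture_worst_regret(n))
```

In this toy setting the mixture's worst case occurs on the all-zeros and all-ones sequences and grows like ln(n + 1), while the Shtarkov value grows roughly like (1/2) ln n, so the gap illustrates how a weighted-average forecaster can fall short of the minimax regret.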