Nonstochastic bandits: Countable decision set, unbounded costs and reactive environments
Theoretical Computer Science
Consistency of discrete Bayesian learning
Theoretical Computer Science
Following the Perturbed Leader to Gamble at Multi-armed Bandits
ALT '07 Proceedings of the 18th international conference on Algorithmic Learning Theory
The weak aggregating algorithm and weak mixability
Journal of Computer and System Sciences
A new understanding of prediction markets via no-regret learning
Proceedings of the 11th ACM conference on Electronic commerce
Online Learning in Case of Unbounded Losses Using Follow the Perturbed Leader Algorithm
The Journal of Machine Learning Research
Combining initial segments of lists
ALT'11 Proceedings of the 22nd International Conference on Algorithmic Learning Theory
The missing consistency theorem for Bayesian learning: stochastic model selection
ALT'06 Proceedings of the 17th international conference on Algorithmic Learning Theory
Defensive universal learning with experts
ALT'05 Proceedings of the 16th international conference on Algorithmic Learning Theory
On following the perturbed leader in the bandit setting
ALT'05 Proceedings of the 16th international conference on Algorithmic Learning Theory
FPL analysis for adaptive bandits
SAGA'05 Proceedings of the Third International Conference on Stochastic Algorithms: Foundations and Applications
Online Multiple Kernel Classification
Machine Learning
Combining initial segments of lists
Theoretical Computer Science
When applying aggregating strategies to Prediction with Expert Advice (PEA), the learning rate must be adaptively tuned. The natural choice of sqrt(complexity/current loss) renders the analysis of Weighted Majority (WM) derivatives quite complicated; in particular, no results have been proven so far for arbitrary weights. The analysis of the alternative Follow the Perturbed Leader (FPL) algorithm of Kalai and Vempala (2003), which is based on Hannan's algorithm, is easier. We derive loss bounds for an adaptive learning rate, both for finite expert classes with uniform weights and for countable expert classes with arbitrary weights. For the former setup, our loss bounds match the best results known so far, while for the latter our results are new.
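To make the FPL scheme described in the abstract concrete, the following is a minimal sketch for the finite-expert, uniform-weight case. It assumes losses in [0, 1], takes the complexity term to be ln(number of experts), and tunes the learning rate as sqrt(complexity / smallest cumulative loss so far), one hedged reading of the sqrt(complexity/current loss) rule; the exponential perturbation follows the Kalai-Vempala construction, but all parameter choices here are illustrative, not the paper's exact tuning.

```python
import math
import random

def fpl_adaptive(loss_matrix, n_experts):
    """Follow the Perturbed Leader with an adaptively tuned learning rate.

    loss_matrix: sequence of per-round loss vectors, losses in [0, 1].
    Returns (total loss incurred, final cumulative losses per expert).
    """
    cum_loss = [0.0] * n_experts
    complexity = math.log(n_experts)  # uniform weights: k = ln n (assumed)
    total = 0.0
    for losses in loss_matrix:
        # Adaptive rate ~ sqrt(complexity / current loss); the max() floor
        # avoids division by zero early on and is an illustrative choice.
        eta = math.sqrt(complexity / max(min(cum_loss), 1.0))
        # Perturb each expert's cumulative loss with independent
        # exponential noise of scale 1/eta, then follow the leader.
        perturbed = [cum_loss[i] - random.expovariate(1.0) / eta
                     for i in range(n_experts)]
        choice = min(range(n_experts), key=lambda i: perturbed[i])
        total += losses[choice]
        for i in range(n_experts):
            cum_loss[i] += losses[i]
    return total, cum_loss
```

On a sequence where one expert is clearly best, the perturbed leader concentrates on it after a short exploration phase, which is the behavior the regret bounds quantify.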