Playing the matching-shoulders lob-pass game with logarithmic regret

Authors:
Joe Kilian;Kevin J. Lang;Barak A. Pearlmutter
Affiliations:
NEC Research Institute;NEC Research Institute;Siemens Corporate Research
Venue:
COLT '94 Proceedings of the seventh annual conference on Computational learning theory
Year:
1994

Citing (2):
Comparison-based search in the presence of errors. STOC '93 Proceedings of the twenty-fifth annual ACM symposium on Theory of computing
The “lob-pass” problem and an on-line learning model of rational choice. COLT '93 Proceedings of the sixth annual conference on Computational learning theory

Cited by (2):
A competitive approach to game learning. COLT '96 Proceedings of the ninth annual conference on Computational learning theory
A Computational Role for Dopamine Delivery in Human Decision-Making. Journal of Cognitive Neuroscience

Abstract

The best previous algorithm for the matching-shoulders lob-pass game, ARTHUR (Abe and Takeuchi 1993), suffered O(t^{1/2}) regret. We prove that this is the best possible performance for any algorithm that works by accurately estimating the opponent's payoff lines. Then we describe an algorithm which beats that bound and meets the information-theoretic lower bound of O(log t) regret by converging to the best lob rate without accurately estimating the payoff lines. The noise-tolerant binary search procedure that we develop is of independent interest.
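
The abstract does not spell out the noise-tolerant binary search procedure. To illustrate only the problem setting, the sketch below shows a simple baseline: binary search over integers in which each comparison may be wrong with some probability, made robust by repeating each comparison and taking a majority vote. This is not the paper's procedure. The oracle `noisy_less_than`, the function `majority_binary_search`, and the parameters `flip_prob`, `repeats`, and `seed` are all hypothetical names chosen for this example.

```python
import random


def noisy_less_than(x, target, flip_prob, rng):
    """Hypothetical noisy oracle: answers 'x < target', lying with probability flip_prob."""
    truth = x < target
    return (not truth) if rng.random() < flip_prob else truth


def majority_binary_search(lo, hi, target, flip_prob=0.2, repeats=201, seed=0):
    """Locate an integer target in [lo, hi] using only noisy comparisons.

    Assumption: each comparison errs independently with probability
    flip_prob < 1/2, so a majority over `repeats` queries is wrong only
    with very small probability when `repeats` is large.
    """
    rng = random.Random(seed)
    while lo < hi:
        mid = (lo + hi) // 2
        # Ask the same question many times and count the 'mid < target' answers.
        votes = sum(noisy_less_than(mid, target, flip_prob, rng) for _ in range(repeats))
        if 2 * votes > repeats:
            lo = mid + 1  # majority says mid < target, so target lies to the right
        else:
            hi = mid      # majority says mid >= target, so target is mid or to the left
    return lo
```

Repeating every comparison a fixed number of times is the crudest way to tolerate noise: it multiplies the query count by `repeats` at every step. It is shown here only as a point of reference for what a noise-tolerant search has to achieve.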