Online learning with expert advice and finite-horizon constraints

Authors:
Branislav Kveton;Jia Yuan Yu;Georgios Theocharous;Shie Mannor
Affiliations:
Intel Research, Santa Clara, CA;Department of Electrical and Computer Engineering, McGill University;Intel Research, Santa Clara, CA;Department of Electrical and Computer Engineering, McGill University
Venue:
AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 1
Year:
2008

Abstract

In this paper, we study a sequential decision-making problem. The objective is to maximize the average reward accumulated over time subject to temporal cost constraints. The novelty of our setup is that both the rewards and the constraints are controlled by an adversarial opponent. To solve this problem in a practical way, we propose an expert algorithm that guarantees both vanishing regret and a sublinear number of violated constraints. The quality of this solution is demonstrated on a real-world power management problem. Our results support the hypothesis that online learning with convex cost constraints can be performed successfully in practice.
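
To make the notion of vanishing regret concrete, the following is a minimal sketch of a generic exponentially weighted expert forecaster. It is not the algorithm proposed in the paper and it ignores the cost constraints entirely; it only illustrates how the average regret against the best fixed expert shrinks as the horizon grows. The function names, the reward-table format, and the toy reward sequence are assumptions made for illustration.

```python
import math


def hedge(reward_rows, eta):
    """Run an exponentially weighted forecaster on a reward table.

    reward_rows[t][i] is the reward in [0, 1] of expert i at round t
    (a hypothetical input format). Returns the learner's expected total
    reward and the total reward of the best fixed expert in hindsight.
    """
    n = len(reward_rows[0])
    log_w = [0.0] * n              # log-weights, kept in log space for stability
    learner_total = 0.0
    expert_totals = [0.0] * n
    for rewards in reward_rows:
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        z = sum(w)
        probs = [wi / z for wi in w]   # distribution over experts this round
        learner_total += sum(p * r for p, r in zip(probs, rewards))
        for i, r in enumerate(rewards):
            log_w[i] += eta * r        # reward-based multiplicative update
            expert_totals[i] += r
    return learner_total, max(expert_totals)


def average_regret(reward_rows):
    """Average regret per round, using the standard fixed-horizon learning rate."""
    T = len(reward_rows)
    n = len(reward_rows[0])
    eta = math.sqrt(8.0 * math.log(n) / T)
    learner, best = hedge(reward_rows, eta)
    return (best - learner) / T


def toy_rewards(T):
    """Hypothetical reward sequence for three experts (illustration only)."""
    return [[float(t % 2), 0.6, 0.2] for t in range(T)]


if __name__ == "__main__":
    for T in (100, 10000):
        print(T, average_regret(toy_rewards(T)))
```

With this learning rate, the classical analysis bounds the average regret by sqrt(ln N / (2T)) for N experts and rewards in [0, 1], so it tends to zero as T grows. A constrained variant, as studied in the paper, would additionally have to track the accumulated costs, which this sketch does not attempt.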