The shortest path problem under partial monitoring

Authors:
András György; Tamás Linder; György Ottucsák
Affiliations:
Informatics Laboratory, Computer and Automation Research Institute of the Hungarian Academy of Sciences, Budapest, Hungary; Informatics Laboratory, Computer and Automation Research Institute of the Hungarian Academy of Sciences, Budapest, Hungary; Department of Computer Science and Information Theory, Budapest University of Technology and Economics, Budapest, Hungary
Venue:
COLT'06 Proceedings of the 19th annual conference on Learning Theory
Year:
2006

Abstract

The on-line shortest path problem is considered under partial monitoring scenarios. At each round, a decision maker has to choose a path between two distinguished vertices of a weighted directed acyclic graph whose edge weights can change in an arbitrary (adversarial) way, with the goal of keeping the loss of the chosen path (defined as the sum of the weights of its edges) small. In the multi-armed bandit setting, after choosing a path, the decision maker learns only the weights of those edges that belong to the chosen path. For this scenario, an algorithm is given whose average cumulative loss in n rounds exceeds that of the best path, matched off-line to the entire sequence of the edge weights, by a quantity that is proportional to $1/\sqrt{n}$ and depends only polynomially on the number of edges of the graph. The algorithm can be implemented with complexity linear in the number of rounds n and in the number of edges. This result improves on earlier bandit algorithms, whose performance bounds either depend exponentially on the number of edges or converge to zero at a slower rate than $O(1/\sqrt{n})$. An extension to the so-called label efficient setting is also given, where the decision maker is informed about the weight of the chosen path only with probability ε
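To make the bandit feedback model concrete, the following is a minimal simulation sketch. It is not the algorithm of the paper: it runs a generic exponentially weighted forecaster over explicitly enumerated paths, with importance-weighted estimates of the edge losses, and so does not reproduce the linear-complexity implementation mentioned above (explicit enumeration can be exponential in the number of edges). All names (`all_paths`, `bandit_shortest_path`, `loss_fn`, `eta`, `gamma`) are hypothetical, and edge losses are assumed to lie in [0, 1].

```python
import math
import random


def all_paths(edges, source, target):
    """Enumerate every source-to-target path of a DAG as a tuple of edges."""
    out = {}
    for u, v in edges:
        out.setdefault(u, []).append((u, v))

    def walk(node):
        if node == target:
            yield ()
            return
        for edge in out.get(node, []):
            for rest in walk(edge[1]):
                yield (edge,) + rest

    return list(walk(source))


def bandit_shortest_path(edges, source, target, loss_fn, n_rounds,
                         eta=0.1, gamma=0.05, seed=0):
    """Illustrative forecaster: the learner sees only the losses of the
    edges on the path it chose (assumption: losses are in [0, 1]).

    Returns (average loss of the learner, average loss of the best fixed path).
    """
    rng = random.Random(seed)
    paths = all_paths(edges, source, target)
    est = {e: 0.0 for e in edges}    # estimated cumulative edge losses
    true = {e: 0.0 for e in edges}   # true cumulative losses, evaluation only
    learner_loss = 0.0

    for t in range(n_rounds):
        # Exponential weights on estimated cumulative path losses.
        scores = [-eta * sum(est[e] for e in p) for p in paths]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        norm = sum(weights)
        # Mix in uniform exploration so every path keeps positive probability.
        probs = [(1 - gamma) * w / norm + gamma / len(paths) for w in weights]

        chosen = rng.choices(paths, weights=probs)[0]
        losses = loss_fn(t)          # dict: edge -> loss in round t
        learner_loss += sum(losses[e] for e in chosen)

        # Bandit feedback: update only edges on the chosen path, dividing by
        # the probability that the edge was observed (importance weighting).
        for e in chosen:
            q = sum(pr for p, pr in zip(paths, probs) if e in p)
            est[e] += losses[e] / q

        for e in edges:              # bookkeeping hidden from the learner
            true[e] += losses[e]

    best = min(sum(true[e] for e in p) for p in paths)
    return learner_loss / n_rounds, best / n_rounds


if __name__ == "__main__":
    demo_edges = [("s", "a"), ("a", "t"), ("s", "b"), ("b", "t")]
    fixed = {("s", "a"): 0.1, ("a", "t"): 0.1, ("s", "b"): 0.9, ("b", "t"): 0.9}
    avg, best = bandit_shortest_path(demo_edges, "s", "t", lambda t: fixed, 2000)
    print("learner:", avg, "best fixed path:", best)
```

The difference between the two returned values is the average regret that the abstract bounds. In this sketch the estimate for an edge stays unbiased because the observed loss is divided by the probability that the edge lay on the sampled path.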