Algorithm portfolio selection as a bandit problem with unbounded losses

Authors:
Matteo Gagliolo; Jürgen Schmidhuber
Affiliations:
CoMo, Vrije Universiteit Brussel, Brussels, Belgium 1050; IDSIA, Manno (Lugano), Switzerland 6928 and Faculty of Informatics, University of Lugano, Lugano, Switzerland 6904
Venue:
Annals of Mathematics and Artificial Intelligence
Year:
2011

Abstract

We propose a method that learns to allocate computation time to a given set of algorithms of unknown performance, with the aim of solving a given sequence of problem instances in minimum time. Analogous meta-learning techniques are typically based on models of algorithm performance learned during a separate offline training sequence, which can be prohibitively expensive. We instead adopt an online approach, named GAMBLETA, in which algorithm performance models are iteratively updated and used to guide allocation on a sequence of problem instances. GAMBLETA is a general method for selecting among two or more alternative algorithm portfolios. Each portfolio has its own way of allocating computation time to the available algorithms, possibly based on performance models, in which case its performance is expected to improve over time as more runtime data becomes available. The resulting exploration-exploitation trade-off is represented as a bandit problem. In our previous work, the algorithms corresponded to the arms of the bandit, and the allocations evaluated by the different portfolios were mixed using a solver for the bandit problem with expert advice. This required setting an arbitrary bound on algorithm runtimes, which invalidated the optimal regret of the solver. In this paper, we propose a simpler version of GAMBLETA in which the allocators correspond to the arms, so that a single portfolio is selected for each instance. The selection is represented as a bandit problem with partial information and an unknown bound on losses. We devise a solver for this game and prove a bound on its expected regret. We present experiments based on results from several solver competitions in various domains, comparing GAMBLETA with another online method.
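
The setting described in the abstract (one arm per allocator, only the chosen arm's loss observed, no loss bound known in advance) can be illustrated with a small sketch. This is not the solver devised in the paper. It is a generic Exp3-style selector combined with a simple doubling guess for the loss bound, and every name and parameter value in it (`DoublingBoundExp3`, `simulate`, `eta`, `gamma`, `initial_bound`, the fixed runtimes) is an assumption made for illustration only.

```python
import math
import random


class DoublingBoundExp3:
    """Exp3-style selector whose guess of the loss bound doubles when exceeded.

    Illustrative sketch only; not the solver analysed in the paper.
    """

    def __init__(self, n_arms, eta=0.1, gamma=0.05, initial_bound=1.0, seed=0):
        self.n_arms = n_arms
        self.eta = eta              # learning rate (illustrative value)
        self.gamma = gamma          # uniform exploration rate (illustrative value)
        self.bound = initial_bound  # current guess of the largest loss
        self.rng = random.Random(seed)
        self.est = [0.0] * n_arms   # importance-weighted losses, scaled by bound

    def probabilities(self):
        low = min(self.est)  # shift so the largest weight is exactly 1
        weights = [math.exp(-self.eta * (e - low)) for e in self.est]
        total = sum(weights)
        return [(1.0 - self.gamma) * w / total + self.gamma / self.n_arms
                for w in weights]

    def select(self):
        probs = self.probabilities()
        arm = self.rng.choices(range(self.n_arms), weights=probs)[0]
        return arm, probs[arm]

    def update(self, arm, prob, loss):
        if loss > self.bound:
            # The guess was too small: double it until it covers the loss,
            # then start a new epoch with fresh estimates and skip this sample.
            while loss > self.bound:
                self.bound *= 2.0
            self.est = [0.0] * self.n_arms
            return
        # Only the chosen arm's loss is observed (partial information),
        # so it is divided by the probability with which the arm was drawn.
        self.est[arm] += (loss / self.bound) / prob


def simulate(runtimes, rounds, seed=0):
    """Run the selector on hypothetical allocators with fixed runtimes as losses."""
    bandit = DoublingBoundExp3(len(runtimes), seed=seed)
    for _ in range(rounds):
        arm, prob = bandit.select()
        bandit.update(arm, prob, runtimes[arm])
    return bandit
```

For example, `simulate([1.0, 10.0], rounds=2000)` models two hypothetical allocators whose per-instance runtimes are 1 and 10 time units. The selector starts with a bound guess of 1, enlarges it the first time the slower allocator is tried, and then shifts most of its probability mass to the faster one. Restarting on each doubling discards earlier estimates, which keeps the sketch short but is cruder than a solver designed for unknown loss bounds.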