Algorithm portfolio selection as a bandit problem with unbounded losses
Annals of Mathematics and Artificial Intelligence
Algorithm selection is typically based on models of algorithm performance learned during a separate offline training sequence, which can be prohibitively expensive. In recent work, we adopted an online approach, in which a performance model is iteratively updated and used to guide selection on a sequence of problem instances. The resulting exploration-exploitation trade-off was represented as a bandit problem with expert advice and addressed with an existing solver for this game; however, this required setting an arbitrary bound on algorithm runtimes, which invalidated the solver's optimal regret guarantee. In this paper, we propose a simpler framework that represents algorithm selection as a bandit problem with partial information and an unknown bound on losses. We adapt an existing solver to this game, proving a bound on its expected regret that also holds for the resulting algorithm selection technique. We present experiments with a set of SAT solvers on a mixed SAT-UNSAT benchmark.
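To make the setting concrete, the following is a minimal sketch of bandit-based algorithm selection in the Exp3 style: each "arm" is a solver, the loss of a pull is that solver's runtime on the current instance, and only the chosen solver's loss is observed (partial information). Since the paper's point is that losses have no known bound, this sketch rescales losses by the largest value seen so far; that running-maximum heuristic is purely illustrative and is not the estimator analysed in the paper. All names (`exp3_select`, `losses_fn`) are hypothetical.

```python
import math
import random

def exp3_select(losses_fn, n_arms, n_rounds, gamma=0.1, seed=0):
    """Exp3-style solver selection under partial information.

    losses_fn(arm, t) returns the loss (e.g. runtime) of running
    solver `arm` on instance t.  Losses may be unbounded; we rescale
    by the running maximum observed loss -- an illustrative heuristic,
    not the bound-free scheme proposed in the paper.
    """
    rng = random.Random(seed)
    weights = [1.0] * n_arms
    max_loss = 1.0          # running estimate of the loss bound
    total_loss = 0.0
    for t in range(n_rounds):
        wsum = sum(weights)
        # mix exponential weights with uniform exploration
        probs = [(1 - gamma) * w / wsum + gamma / n_arms for w in weights]
        # sample a solver according to probs
        r, arm, acc = rng.random(), n_arms - 1, 0.0
        for i, p in enumerate(probs):
            acc += p
            if r <= acc:
                arm = i
                break
        loss = losses_fn(arm, t)          # only this solver's loss is seen
        total_loss += loss
        max_loss = max(max_loss, loss)
        # importance-weighted, rescaled loss estimate for the pulled arm
        est = (loss / max_loss) / probs[arm]
        weights[arm] *= math.exp(-gamma * est / n_arms)
    return total_loss, weights
```

Run on a toy pair of solvers where one is consistently ten times slower, the selector's weight mass concentrates on the faster solver, mirroring how runtime feedback steers selection over a problem sequence.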