The cross-entropy method is an efficient and general optimization algorithm, but its applicability in reinforcement learning (RL) seems limited because it often converges to suboptimal policies. We add noise to prevent premature convergence of the cross-entropy method, demonstrating the approach on the computer game Tetris. The resulting policy outperforms previous RL algorithms by almost two orders of magnitude.
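The idea can be sketched as follows: maintain an independent Gaussian over each policy parameter, refit it to the elite samples each generation, and add a noise term to the variances so the distribution cannot collapse prematurely. This is a minimal illustrative sketch, not the paper's implementation; the population size, elite fraction, and the decreasing noise schedule `max(5 - t/10, 0)` are assumptions for this example, and the toy quadratic objective stands in for evaluating a Tetris policy's score.

```python
import numpy as np

def noisy_cross_entropy(evaluate, dim, n_samples=100, elite_frac=0.1,
                        n_iters=50, noise=lambda t: max(5.0 - t / 10.0, 0.0),
                        seed=0):
    """Cross-entropy method with added noise on the variances.

    `evaluate` maps a parameter vector to a scalar score (higher is better);
    for Tetris it would play games with the induced policy and return the score.
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)
    var = np.full(dim, 100.0)          # broad initial search distribution
    n_elite = max(1, int(elite_frac * n_samples))
    for t in range(n_iters):
        # Sample a population from the current Gaussian.
        samples = rng.normal(mean, np.sqrt(var), size=(n_samples, dim))
        scores = np.array([evaluate(w) for w in samples])
        # Refit the distribution to the best (elite) samples.
        elite = samples[np.argsort(scores)[-n_elite:]]
        mean = elite.mean(axis=0)
        # Added noise keeps the variance from collapsing too early.
        var = elite.var(axis=0) + noise(t)
    return mean

# Toy stand-in objective with optimum at w = (3, 3, 3, 3):
best = noisy_cross_entropy(lambda w: -np.sum((w - 3.0) ** 2), dim=4)
```

Without the `noise(t)` term this is the plain cross-entropy method; the variance then shrinks geometrically and the search can freeze on a suboptimal policy, which is exactly the failure mode the added noise counteracts.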