The cross-entropy method is an efficient and general optimization algorithm, but its applicability in reinforcement learning (RL) seems limited because it often converges to suboptimal policies. We add noise to prevent premature convergence of the cross-entropy method, demonstrating the approach on the computer game Tetris. The resulting policy outperforms previous RL algorithms by almost two orders of magnitude.
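The idea can be sketched as follows: maintain an independent Gaussian over each policy parameter, refit it to the elite samples each generation, and add a noise term to the variances so the distribution cannot collapse prematurely. This is a minimal illustrative sketch, not the paper's implementation; the population size, elite fraction, and the decreasing noise schedule `max(5 - t/10, 0)` are assumptions for this example, and the toy quadratic objective stands in for evaluating a Tetris policy's score.

```python
import numpy as np

def noisy_cross_entropy(evaluate, dim, n_samples=100, elite_frac=0.1,
                        n_iters=50, noise=lambda t: max(5.0 - t / 10.0, 0.0),
                        seed=0):
    """Cross-entropy method with added noise on the variances.

    `evaluate` maps a parameter vector to a scalar score (higher is better);
    for Tetris it would play games with the induced policy and return the score.
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)
    var = np.full(dim, 100.0)          # broad initial search distribution
    n_elite = max(1, int(elite_frac * n_samples))
    for t in range(n_iters):
        # Sample a population from the current Gaussian.
        samples = rng.normal(mean, np.sqrt(var), size=(n_samples, dim))
        scores = np.array([evaluate(w) for w in samples])
        # Refit the distribution to the best (elite) samples.
        elite = samples[np.argsort(scores)[-n_elite:]]
        mean = elite.mean(axis=0)
        # Added noise keeps the variance from collapsing too early.
        var = elite.var(axis=0) + noise(t)
    return mean

# Toy stand-in objective with optimum at w = (3, 3, 3, 3):
best = noisy_cross_entropy(lambda w: -np.sum((w - 3.0) ** 2), dim=4)
```

Without the `noise(t)` term this is the plain cross-entropy method; the variance then shrinks geometrically and the search can freeze on a suboptimal policy, which is exactly the failure mode the added noise counteracts.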