Bootstrapping learning from abstract models in games

Authors:
Purvag Patel;Norman Carver;Shahram Rahimi
Affiliations:
Department of Computer Science, Southern Illinois University, Carbondale, IL, 62901, USA;Department of Computer Science, Southern Illinois University, Carbondale, IL, 62901, USA;Department of Computer Science, Southern Illinois University, Carbondale, IL, 62901, USA
Venue:
International Journal of Bio-Inspired Computation
Year:
2013

Abstract

Computer gaming environments are real time, dynamic, and complex, with incomplete knowledge of the world. Agents in such environments require detailed models of the world if they are to learn effective policies. Machine learning techniques such as reinforcement learning can become intractable with large, detailed world models. In this paper we tackle the well-known problem of slow convergence in reinforcement learning with a detailed model of the world, specifically for video games. We propose first training the agent with an abstract model of the world and then using the resulting policy to initialise the system before training the agent with the detailed model of the world. This paper reports results from applying the proposed technique to the classic arcade game Asteroids. Our experiments show that an agent can quickly learn a policy with the abstract model, and that when this policy's learned values are used to initialise the detailed model, learning with the detailed model converges faster.
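
The two-stage idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes tabular Q-learning, and it swaps Asteroids for a toy one-dimensional corridor in which the abstract model is simply a coarser grid of the same corridor. The names q_learning and lift, the reward scheme, and all hyperparameters are hypothetical choices made for illustration.

```python
import random


def q_learning(n_states, goal, episodes, q=None, alpha=0.5, gamma=0.9,
               epsilon=0.1, seed=0, max_steps=200):
    """Tabular Q-learning on a 1-D corridor (toy stand-in for a game world).

    Actions: 0 = move left, 1 = move right. Reward 1.0 on reaching `goal`.
    If `q` is given, learning starts from those values instead of zeros.
    """
    rng = random.Random(seed)
    if q is None:
        q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            # Epsilon-greedy choice; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s2 = max(0, min(n_states - 1, s + (1 if a == 1 else -1)))
            done = s2 == goal
            r = 1.0 if done else 0.0
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            if done:
                break
    return q


def lift(abstract_q, n_detailed):
    """Initialise a detailed Q-table from an abstract one.

    Assumes each abstract state covers an equal-sized block of detailed
    states; every detailed state copies the values of its block.
    """
    ratio = n_detailed // len(abstract_q)
    return [list(abstract_q[s // ratio]) for s in range(n_detailed)]


# Stage 1: learn quickly on the small abstract model (5 states).
abstract_q = q_learning(n_states=5, goal=4, episodes=50)

# Stage 2: seed the detailed model (20 states) with the abstract values,
# then continue learning on the detailed model.
detailed_q = lift(abstract_q, n_detailed=20)
detailed_q = q_learning(n_states=20, goal=19, episodes=50, q=detailed_q)
```

In this sketch the only link between the two stages is the lift function, which maps each detailed state to the abstract state that covers it. How the real abstract and detailed Asteroids state spaces correspond is not specified in the abstract and is not assumed here.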