Dyna-H: A heuristic planning reinforcement learning algorithm applied to role-playing game strategy decision systems

Authors:
Matilde Santos; José Antonio Martín H.; Victoria López; Guillermo Botella
Affiliations:
Computer Architectures and Automation, Complutense University of Madrid, Spain (all authors)
Venue:
Knowledge-Based Systems
Year:
2012

Abstract

In a role-playing game, finding optimal trajectories is one of the most important tasks, which makes the strategy decision system a key component of the game engine. The way in which decisions are taken (e.g. online, batch or simulated) and the resources consumed in decision making (e.g. execution time, memory) strongly influence game performance. When classical search algorithms such as A* can be used, they are the first option. However, such methods rely on precise and complete models of the search space, so there are many interesting scenarios where they cannot be applied; in those cases, model-free methods for sequential decision making under uncertainty are the best choice. In this paper, we propose a heuristic planning strategy that gives a Dyna agent the heuristic-search ability used in path-finding. The proposed Dyna-H algorithm selects branches that are more likely to produce outcomes than other branches, just as A* does. Unlike A*, however, it retains the advantages of a model-free online reinforcement learning algorithm. We evaluated the proposed algorithm against one-step Q-learning and Dyna-Q and found that Dyna-H produced clearly superior results.
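
The abstract gives no pseudocode, so the following is only a rough sketch of how a heuristic could steer the planning phase of a Dyna-style agent. Everything specific here is an assumption made for illustration and is not taken from the paper: the 5x5 grid world, the Manhattan-distance heuristic, the rule of following the remembered action whose predicted next state is closest to the goal, and all names such as `dyna_h_sketch` and `heuristic_action`.

```python
import random

# Hypothetical grid-world setup; none of these names come from the paper.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def manhattan(a, b):
    """Assumed heuristic: Manhattan distance between two grid cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def step(state, action, size, goal):
    """Deterministic move, clipped at the walls; reward 1 only at the goal."""
    x = min(max(state[0] + action[0], 0), size - 1)
    y = min(max(state[1] + action[1], 0), size - 1)
    nxt = (x, y)
    return nxt, (1.0 if nxt == goal else 0.0)


def dyna_h_sketch(size=5, goal=(4, 4), episodes=30, planning_steps=10,
                  alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    n = len(ACTIONS)
    q = {}      # (state, action index) -> estimated value
    model = {}  # (state, action index) -> (next state, reward) seen so far

    def greedy(s):
        values = [q.get((s, a), 0.0) for a in range(n)]
        best = max(values)
        return rng.choice([a for a, v in enumerate(values) if v == best])

    def update(s, a, r, s2):
        # One-step Q-learning backup, used for real and simulated experience.
        target = r + gamma * max(q.get((s2, b), 0.0) for b in range(n))
        old = q.get((s, a), 0.0)
        q[(s, a)] = old + alpha * (target - old)

    def heuristic_action(s):
        # Assumed rule: among actions already tried in s, pick the one whose
        # remembered next state is closest to the goal.
        known = [a for a in range(n) if (s, a) in model]
        if not known:
            return None
        return min(known, key=lambda a: manhattan(model[(s, a)][0], goal))

    steps_per_episode = []
    for _episode in range(episodes):
        s, steps = (0, 0), 0
        while s != goal and steps < 500:
            a = rng.randrange(n) if rng.random() < epsilon else greedy(s)
            s2, r = step(s, ACTIONS[a], size, goal)
            update(s, a, r, s2)        # learn from real experience
            model[(s, a)] = (s2, r)    # remember the observed transition
            p = s
            for _plan in range(planning_steps):
                pa = heuristic_action(p)
                if pa is None:         # nothing known here: restart elsewhere
                    p, pa = rng.choice(list(model))
                p2, pr = model[(p, pa)]
                update(p, pa, pr, p2)  # learn from simulated experience
                p = p2
            s, steps = s2, steps + 1
        steps_per_episode.append(steps)
    return q, steps_per_episode
```

Calling `dyna_h_sketch()` returns the learned value table and the number of real steps taken in each episode. Replacing `heuristic_action` with a uniformly random choice over remembered transitions would turn the planning loop into a plain Dyna-Q-style one, which is a convenient way to compare the two on the same toy grid.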