Time-based reward shaping in real-time strategy games

  • Authors:
  • Martin Midtgaard;Lars Vinther;Jeppe R. Christiansen;Allan M. Christensen;Yifeng Zeng

  • Affiliations:
  • Aalborg University, Denmark (all authors)

  • Venue:
  • ADMI'10: Proceedings of the 6th International Conference on Agents and Data Mining Interaction
  • Year:
  • 2010

Abstract

Real-Time Strategy (RTS) games are a challenging domain for AI because they involve not only a large state space but also dynamic actions that agents execute concurrently. This problem cannot be solved optimally with general Q-learning techniques, so we propose a solution based on a Semi-Markov Decision Process (SMDP). We present a time-based reward shaping technique, TRS, that speeds up learning in reinforcement learning. In particular, we show that our technique preserves solution optimality for some SMDP problems. We evaluate our method in the Spring game Balanced Annihilation and report benchmark results.
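
The abstract does not give the TRS formula, so the following is only an illustrative sketch of the general idea, not the authors' method. It shows a standard tabular SMDP Q-learning update, where an action lasting `tau` time steps discounts the next state's value by `gamma ** tau`, combined with a generic shaping term that subtracts a penalty proportional to elapsed time. The function name `smdp_q_update`, the linear `time_penalty` term, and all parameter values are assumptions made for illustration.

```python
from collections import defaultdict


def smdp_q_update(q, state, action, reward, tau, next_state, next_actions,
                  alpha=0.1, gamma=0.95, time_penalty=0.0):
    """Apply one tabular SMDP Q-learning update and return the new Q-value.

    `tau` is the number of time steps the action took to complete.
    `time_penalty` is a hypothetical shaping coefficient (an assumption,
    not taken from the paper): slower actions receive a lower shaped reward.
    """
    # Assumed shaping: subtract a cost proportional to the action's duration.
    shaped_reward = reward - time_penalty * tau

    # Greedy value of the successor state (0.0 if no actions are available).
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)

    # SMDP target: discount the successor value by gamma raised to tau.
    target = shaped_reward + (gamma ** tau) * best_next

    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Usage: a hypothetical "build" action that takes 3 time steps.
q_table = defaultdict(float)
new_value = smdp_q_update(q_table, "s0", "build", reward=1.0, tau=3,
                          next_state="s1", next_actions=["build", "wait"],
                          alpha=0.5, gamma=0.9, time_penalty=0.1)
```

With `time_penalty=0.0` this reduces to plain SMDP Q-learning. Whether a given shaping term preserves the optimal policy depends on the problem, which is the question the paper addresses for TRS.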