Simulation of sequential data: An enhanced reinforcement learning approach

  • Authors:
  • Marlies Vanhulsel, Davy Janssens, Geert Wets, Koen Vanhoof

  • Affiliations:
  • Hasselt University - Campus Diepenbeek, Transportation Research Institute, Wetenschapspark 5 Bus 6, B-3590 Diepenbeek, Belgium (all authors)

  • Venue:
  • Expert Systems with Applications: An International Journal

  • Year:
  • 2009

Abstract

The present study contributes to the state of the art of activity-based travel demand modelling by presenting a framework for simulating sequential data. To this end, the suitability of a reinforcement learning approach for reproducing sequential data is explored. Because traditional reinforcement learning techniques neither learn efficiently in large state and action spaces, owing to their memory and computation requirements, nor generalize well when many state-action pairs are visited only infrequently, the commonly used reinforcement learning technique is enhanced by means of regression tree function approximation. Three reinforcement learning algorithms are implemented to validate their applicability: traditional Q-learning and Q-learning with bucket-brigade updating are tested against the enhanced approach with a CART function approximator. The methods are applied to data from 26 diary days. The results are promising and show that the proposed techniques offer great opportunities for simulating sequential data. Moreover, the approach enhanced with a regression tree function approximator converges to a better solution considerably faster than the two traditional Q-learning approaches.
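
To make the enhanced technique concrete, the sketch below shows a generic Q-learning loop in which a CART regression tree replaces the usual Q-table, so that Q-values generalize across state-action pairs rather than being stored per pair. This is an illustrative sketch only: the toy chain environment, the (state, action) feature encoding, and all hyperparameters are assumptions made for the example, not the authors' actual implementation or data.

```python
# Sketch: Q-learning with a CART function approximator (fitted-Q style).
# Assumptions: a hypothetical chain environment and ad-hoc hyperparameters.
import random
import numpy as np
from sklearn.tree import DecisionTreeRegressor

N_STATES, N_ACTIONS = 20, 4   # hypothetical small sequential task
GAMMA, EPSILON = 0.95, 0.1

def step(state, action):
    """Toy transition: action 0 advances along a chain; reward at the end."""
    nxt = min(state + (1 if action == 0 else 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

tree = DecisionTreeRegressor(max_depth=6)  # CART approximator for Q(s, a)
fitted = False
replay = []  # observed (state, action, reward, next_state, done) tuples

def q_values(state):
    """Predict Q(s, a) for all actions; zeros before the first fit."""
    if not fitted:
        return np.zeros(N_ACTIONS)
    X = np.array([[state, a] for a in range(N_ACTIONS)])
    return tree.predict(X)

for episode in range(200):
    state, done = 0, False
    while not done:
        # epsilon-greedy action selection over the tree's Q-estimates
        if random.random() < EPSILON:
            action = random.randrange(N_ACTIONS)
        else:
            action = int(np.argmax(q_values(state)))
        nxt, reward, done = step(state, action)
        replay.append((state, action, reward, nxt, done))
        state = nxt
    # Re-fit the tree on bootstrapped Q-targets: the tree generalizes
    # over state-action pairs instead of storing a lookup table.
    X = np.array([[s, a] for s, a, r, s2, d in replay])
    y = np.array([r if d else r + GAMMA * q_values(s2).max()
                  for s, a, r, s2, d in replay])
    tree.fit(X, y)
    fitted = True
```

Compared with a tabular Q-update, which touches one (state, action) cell at a time, the tree is refit on all observed transitions, so each observation can inform the value estimates of similar, unvisited state-action pairs. This is what the abstract refers to as generalizing despite infrequent visits.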