Optimizing time warp simulation with reinforcement learning techniques

Authors:
Jun Wang; Carl Tropper
Affiliations:
McGill University, Montreal, Canada; McGill University, Montreal, Canada
Venue:
Proceedings of the 39th conference on Winter simulation: 40 years! The best is yet to come
Year:
2007

Abstract

Adaptive Time Warp protocols in the literature are usually based on a pre-defined analytic model of the system, expressed as a closed-form function that maps the system state to a control parameter. The underlying assumption is that this model is itself optimal. In this paper we present a new approach that uses Reinforcement Learning techniques, also known as simulation-based dynamic programming. Instead of assuming an optimal control strategy, the goal of Reinforcement Learning is to find the optimal strategy through simulation. A value function that captures the history of system feedback is used, and no prior knowledge of the system is required. Our Reinforcement Learning techniques were implemented in a distributed VLSI simulator with the objective of finding the optimal size of a bounded time window. Our experiments with two benchmark circuits indicate that the approach succeeded in doing so.
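
The idea of learning a window size from feedback, rather than from a closed-form model, can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a stateless, epsilon-greedy learner over a small discrete set of candidate window sizes, and the names `WindowSizeLearner`, `toy_reward`, and `run` are hypothetical. The reward function is a stand-in that peaks at a window of 40; in a real Time Warp simulator the feedback would come from measured performance, which the abstract does not specify.

```python
import random


class WindowSizeLearner:
    """Epsilon-greedy learner over candidate window sizes (illustrative only)."""

    def __init__(self, window_sizes, epsilon=0.1, step_size=0.1, seed=0):
        self.window_sizes = list(window_sizes)
        self.epsilon = epsilon        # probability of trying a random window size
        self.step_size = step_size    # weight given to the newest feedback
        # Value estimates summarize the history of feedback per window size.
        self.values = {w: 0.0 for w in self.window_sizes}
        self.rng = random.Random(seed)

    def choose(self):
        """Explore with probability epsilon, otherwise pick the best estimate."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.window_sizes)
        return self.best()

    def update(self, window, reward):
        """Move the estimate for this window size toward the observed reward."""
        self.values[window] += self.step_size * (reward - self.values[window])

    def best(self):
        """Return the window size with the highest current value estimate."""
        return max(self.window_sizes, key=lambda w: self.values[w])


def toy_reward(window, rng):
    """Hypothetical feedback signal: highest near window 40, plus small noise."""
    return -abs(window - 40) / 10.0 + rng.gauss(0.0, 0.1)


def run(steps=2000, seed=0):
    """Run the learner against the toy feedback and return it."""
    rng = random.Random(seed)
    learner = WindowSizeLearner([10, 20, 40, 80, 160], seed=seed)
    for _ in range(steps):
        window = learner.choose()
        learner.update(window, toy_reward(window, rng))
    return learner


if __name__ == "__main__":
    print("learned window size:", run().best())
```

The learner needs no model of why one window size outperforms another; it only needs repeated feedback, which mirrors the abstract's point that no prior knowledge of the system is required. A fuller treatment would condition the choice on system state, which this sketch omits.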