Adaptive sampling based large-scale stochastic resource control

  • Authors:
  • Balázs Csanád Csáji; László Monostori

  • Affiliations:
  • Computer and Automation Research Institute, Hungarian Academy of Sciences, Budapest, Hungary; Computer and Automation Research Institute, Hungarian Academy of Sciences, and Faculty of Mechanical Engineering, Budapest University of Technology and Economics

  • Venue:
  • AAAI'06: Proceedings of the 21st National Conference on Artificial Intelligence - Volume 1
  • Year:
  • 2006


Abstract

We consider closed-loop solutions to stochastic optimization problems of the resource allocation type. These problems concern the dynamic allocation of reusable resources over time to non-preemptive, interconnected tasks with stochastic durations. The aim is to minimize the expected value of a regular performance measure. First, we formulate the problem as a stochastic shortest path problem and argue that our formulation has favorable properties: it has a finite horizon, it is acyclic (hence all policies are proper), and the space of control policies can be safely restricted. Then, we propose an iterative solution. Essentially, we apply a reinforcement-learning-based adaptive sampler to compute a sub-optimal control policy. We suggest several approaches to enhance this solution and make it applicable to large-scale problems. The main improvements are: (1) the value function is maintained by feature-based support vector regression; (2) the initial exploration is guided by rollout algorithms; (3) the state space is partitioned by clustering the tasks while keeping the precedence constraints satisfied; (4) the action space is decomposed and, consequently, the number of available actions in a state is decreased; and, finally, (5) we argue that the sampling can be effectively distributed among several processors. The effectiveness of the approach is demonstrated by experimental results on both artificial (benchmark) and real-world (industry-related) data.
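
As a rough illustration of the formulation (not code from the paper), the minimal Python sketch below assigns tasks with precedence constraints and stochastic durations to machines one decision at a time, so the horizon is finite and the induced stochastic shortest path is acyclic; trajectories are sampled and a value function is fitted to Monte Carlo cost-to-go targets. The toy instance, the feature map, and the plain least-squares regressor are all illustrative stand-ins for the paper's feature-based support vector regression, and the rollout guidance, task clustering, and distributed-sampling enhancements are omitted.

```python
import random
import numpy as np

# Toy instance (hypothetical data): a precedence DAG of tasks with
# stochastic durations, to be scheduled on identical machines so as to
# minimize the expected makespan (a regular performance measure).
N_TASKS, N_MACHINES = 5, 2
PREDS = {0: set(), 1: set(), 2: {0}, 3: {0, 1}, 4: {2, 3}}

def sample_duration(task):
    return random.uniform(1.0, 3.0)  # non-preemptive, stochastic durations

def features(done, free, finish):
    """Hand-crafted state features (an illustrative choice)."""
    return np.array([len(done), max(free), min(free),
                     max(finish.values(), default=0.0), 1.0])

def step(done, free, finish, task, machine):
    """Assign a ready task to a machine; sample the successor state."""
    start = max(free[machine],
                max((finish[p] for p in PREDS[task]), default=0.0))
    end = start + sample_duration(task)
    free2 = list(free)
    free2[machine] = end
    return done | {task}, tuple(free2), {**finish, task: end}

def simulate(weights, eps=0.2):
    """One sampled trajectory, eps-greedy w.r.t. the fitted value function."""
    done, free, finish = frozenset(), (0.0,) * N_MACHINES, {}
    visited = []
    while len(done) < N_TASKS:  # exactly N_TASKS decisions: acyclic, proper
        ready = [t for t in range(N_TASKS) if t not in done and PREDS[t] <= done]
        acts = [(t, m) for t in ready for m in range(N_MACHINES)]
        if weights is None or random.random() < eps:
            a = random.choice(acts)
        else:  # one-step lookahead on the learned value function
            a = min(acts, key=lambda tm:
                    features(*step(done, free, finish, *tm)) @ weights)
        visited.append(features(done, free, finish))
        done, free, finish = step(done, free, finish, *a)
    return visited, max(finish.values())  # visited states, realized makespan

def train(n_iter=200):
    """Fit the value function to Monte Carlo cost-to-go targets; with a
    purely terminal cost, the cost-to-go from every visited state along a
    trajectory equals the final makespan."""
    X, y, weights = [], [], None
    for _ in range(n_iter):
        states, cost = simulate(weights)
        X += states
        y += [cost] * len(states)
        weights, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)
    return weights

if __name__ == "__main__":
    w = train()
    print("sampled makespan of greedy policy:", simulate(w, eps=0.0)[1])
```

The sketch exhibits why all policies are proper here: each decision schedules exactly one task, so every trajectory terminates after N_TASKS steps regardless of the policy.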