Action Time Sharing Policies for Ergodic Control of Markov Chains

  • Authors:
  • Amarjit Budhiraja, Xin Liu, Adam Shwartz

  • Author emails:
  • budhiraj@email.unc.edu, liuxin@ima.umn.edu, adam@ee.technion.ac.il

  • Venue:
  • SIAM Journal on Control and Optimization
  • Year:
  • 2012

Abstract

Ergodic control for discrete-time controlled Markov chains with a locally compact state space and a compact action space is considered under suitable stability, irreducibility, and Feller continuity conditions. A flexible family of controls, called action time sharing (ATS) policies, associated with a given continuous stationary Markov control, is introduced. It is shown that the long-term average cost for such a control policy, for a broad range of one-stage cost functions, is the same as that for the associated stationary Markov policy. In addition, ATS policies are well suited for a range of estimation, information collection, and adaptive control goals. To illustrate the possibilities, we present two examples. The first demonstrates the construction of an ATS policy that yields consistent estimators for unknown model parameters while achieving the desired long-term average cost. The second considers a setting where the target stationary Markov control $q$ is unknown but sampling schemes are available that allow for consistent estimation of $q$. We construct an ATS policy that uses dynamic estimators of $q$ for control decisions and show that the associated cost coincides with that for the unknown Markov control $q$.
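
The abstract only summarizes the ATS idea; as a rough illustration (not the paper's construction), the following Python sketch time-shares actions at each state of a toy finite chain using a greedy deficit rule, so that the empirical frequency of each action in each state tracks a target stationary randomized control $q$, and then compares the resulting long-run average cost with that of sampling actions from $q$ directly. All model data (`P`, `c`, `q`) and the deficit rule itself are invented for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy controlled Markov chain: 3 states, 2 actions (all data invented for illustration).
# P[a] is the transition matrix under action a; c[x, a] is the one-stage cost.
P = np.array([
    [[0.7, 0.2, 0.1],
     [0.3, 0.4, 0.3],
     [0.2, 0.3, 0.5]],
    [[0.1, 0.6, 0.3],
     [0.5, 0.2, 0.3],
     [0.4, 0.4, 0.2]],
])
c = np.array([[1.0, 2.0],
              [0.5, 1.5],
              [2.0, 0.2]])

# Target stationary randomized Markov control: q[x, a] = prob. of action a in state x.
q = np.array([[0.8, 0.2],
              [0.5, 0.5],
              [0.1, 0.9]])

def ats_action(x, counts):
    """Deterministic time sharing: in state x, take the action whose empirical
    frequency lags its target q[x, a] the most (a greedy deficit rule)."""
    visits = counts[x].sum()
    deficits = q[x] * (visits + 1) - counts[x]
    return int(np.argmax(deficits))

def markov_action(x, counts):
    """The associated stationary Markov control: sample the action from q[x]."""
    return int(rng.choice(2, p=q[x]))

def long_run_average_cost(policy, T=200_000):
    counts = np.zeros_like(q)   # counts[x, a] = times action a was taken in state x
    x, total = 0, 0.0
    for _ in range(T):
        a = policy(x, counts)
        counts[x, a] += 1
        total += c[x, a]
        x = int(rng.choice(3, p=P[a, x]))
    return total / T

print("ATS policy average cost:              ", long_run_average_cost(ats_action))
print("Stationary Markov policy average cost:", long_run_average_cost(markov_action))
```

Under this rule the empirical action frequencies at each state converge to $q$, so the two printed averages should agree up to simulation noise, mirroring the paper's equal-cost result; the deterministic schedule is what leaves room to interleave estimation or exploration steps, which is what the paper's two examples exploit.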