On improving the performance of simulation-based algorithms for average reward processes with application to network pricing

  • Authors:
  • Enrique Campos-Náñez; Stephen D. Patek

  • Affiliations:
  • University of Virginia, Charlottesville, VA; University of Virginia, Charlottesville, VA

  • Venue:
  • Proceedings of the 33rd conference on Winter simulation
  • Year:
  • 2001


Abstract

We address performance issues associated with simulation-based algorithms for optimizing Markov reward processes. Specifically, we are concerned with algorithms that exploit the regenerative structure of the process in estimating the gradient of the objective function with respect to control parameters. In many applications, states that initially have short expected return times may eventually become infrequently visited as the control parameters are updated. As a result, unbiased updates to the control parameters can become so infrequent as to render the algorithm impractical. The performance of these algorithms can be significantly improved by adapting the state that is used to mark regenerative cycles. In this paper, we introduce such an adaptation procedure, give initial arguments for its convergence properties, and illustrate its application in two numerical examples. The examples relate to the optimal pricing of communication network resources for congestion-controlled traffic.
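The core idea in the abstract, marking regenerative cycles by returns to a designated state and moving that marker when the state becomes rarely visited, can be illustrated with a minimal sketch. This is not the authors' algorithm: the three-state chain, the softmax transition scores `theta`, the reward `r(s) = s`, and the "most visited state so far" adaptation rule are all illustrative assumptions chosen for a self-contained example.

```python
import math
import random

def simulate_adaptive_marker(theta, n_steps=20000, adapt_every=5000, seed=0):
    """Run a parameterized Markov chain, count regenerative cycles
    (returns to the marker state), and periodically move the marker to
    the most frequently visited state so that cycles remain short.

    theta[s][j] are softmax scores for moving from state s to state j
    (hypothetical dynamics, chosen only for illustration).
    """
    rng = random.Random(seed)
    n_states = len(theta)

    def step(s):
        # Sample the next state from a softmax over theta[s].
        scores = theta[s]
        m = max(scores)
        exps = [math.exp(x - m) for x in scores]
        z = sum(exps)
        u, acc = rng.random(), 0.0
        for j, e in enumerate(exps):
            acc += e / z
            if u <= acc:
                return j
        return n_states - 1

    marker = 0                      # initial regeneration state
    visits = [0] * n_states
    state, cycles, total_reward = marker, 0, 0.0
    for t in range(1, n_steps + 1):
        state = step(state)
        visits[state] += 1
        total_reward += float(state)   # simple state-dependent reward r(s) = s
        if state == marker:
            # Regeneration: each completed cycle yields one unbiased
            # update point for a gradient estimator.
            cycles += 1
        if t % adapt_every == 0:
            # Adaptation rule (illustrative): mark cycles with the most
            # visited state so far, so expected return times stay short
            # even as the dynamics drift with the control parameters.
            marker = max(range(n_states), key=visits.__getitem__)
    return {"marker": marker, "cycles": cycles,
            "avg_reward": total_reward / n_steps}
```

The point of the adaptation step is visible in the cycle count: if the initial marker's return time grows as parameters change, cycles (and hence parameter updates) become rare, whereas re-marking with a frequently visited state keeps updates coming.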