Sampled fictitious play for approximate dynamic programming
Computers and Operations Research
KI'05 Proceedings of the 28th annual German conference on Advances in Artificial Intelligence
Approximate stochastic annealing for online control of infinite horizon Markov decision processes
Automatica (Journal of IFAC)
This paper considers the problem of analyzing the finite-time behavior of learning automata. The problem amounts to a finite-time analysis of the learning algorithm used by the automaton, and it is important in determining the automaton's rate of convergence. We propose a general framework for analyzing the finite-time behavior of automaton learning algorithms and use it to present a finite-time analysis of the Pursuit Algorithm. Both the continuous and the discretized forms of the pursuit algorithm are considered. Based on the results of the analysis, we compare the rates of convergence of these two versions. At the end of the paper, we also compare our framework with that of Probably Approximately Correct (PAC) learning.
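For readers unfamiliar with the two variants, the following is a minimal sketch of a pursuit-style learning automaton in a stationary environment with binary rewards. It is not taken from the paper: the update rules follow the commonly described textbook form (move the action probability vector toward the action with the highest estimated reward, either by a fraction `rate` or by a fixed step `1 / (r * resolution)`), and the function name, parameter names, and default values are all hypothetical.

```python
import random


def pursuit(reward_probs, steps=5000, rate=0.01, resolution=None, seed=0):
    """Sketch of a pursuit learning automaton (names and defaults are assumptions).

    reward_probs: success probability of each action in the simulated environment.
    rate:         learning parameter of the continuous variant.
    resolution:   if given, use the discretized variant with step 1 / (r * resolution).
    Returns the final action probability vector.
    """
    rng = random.Random(seed)
    r = len(reward_probs)
    p = [1.0 / r] * r      # action probabilities, initially uniform
    counts = [0] * r       # number of times each action was chosen
    rewards = [0] * r      # number of rewards each action received
    est = [0.0] * r        # running reward estimate per action
    delta = None if resolution is None else 1.0 / (r * resolution)

    for _ in range(steps):
        # Choose an action according to the current probability vector.
        a = rng.choices(range(r), weights=p)[0]
        # Simulated environment: binary reward with the action's success probability.
        beta = 1 if rng.random() < reward_probs[a] else 0

        # Update the reward estimate of the chosen action.
        counts[a] += 1
        rewards[a] += beta
        est[a] = rewards[a] / counts[a]

        # Pursue the action with the highest estimate (ties go to the lowest index).
        best = max(range(r), key=lambda i: est[i])
        if delta is None:
            # Continuous variant: p <- (1 - rate) * p + rate * e_best.
            p = [(1.0 - rate) * pi for pi in p]
            p[best] += rate
        else:
            # Discretized variant: lower every other action by a fixed step,
            # floored at zero, and give the remaining mass to the best action.
            p = [max(pi - delta, 0.0) for pi in p]
            p[best] = 0.0
            p[best] = 1.0 - sum(p)
    return p
```

Calling `pursuit([0.2, 0.8])` runs the continuous variant, and `pursuit([0.2, 0.8], resolution=100)` runs the discretized one. The discretized variant changes probabilities in fixed steps, so a probability can reach exactly 0 or 1 in a finite number of updates, whereas the continuous variant only approaches those values geometrically.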