Online optimization of replacement policies using learning automata
International Journal of Systems Science
Abstract: An absorbing learning automaton based on a stochastic estimator is introduced. Under the proposed scheme, the estimates of the reward probabilities are computed stochastically, so actions that have been selected only a few times still have the opportunity to be estimated as optimal, to increase their choice probabilities and, consequently, to be selected. In this way, the automaton's accuracy is significantly improved. The proposed automaton is proven to be absolutely expedient in all stationary environments, and the simulation results demonstrate that it achieves significantly higher performance than schemes based on deterministic estimators.
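The abstract gives no update equations, so the following Python sketch is only an illustration of the general idea of a stochastic estimator, not the authors' scheme. It makes several assumptions that do not come from the source: the noise is zero-mean Gaussian, its standard deviation shrinks as `noise_scale / (1 + count)` so that rarely selected actions receive noisier estimates, and the choice probabilities follow a simple pursuit-style update toward the action with the highest noisy estimate. The names `run_automaton`, `rate`, and `noise_scale` are hypothetical.

```python
import random


def run_automaton(reward_probs, steps=5000, rate=0.01, noise_scale=1.0, seed=0):
    """Illustrative stochastic-estimator automaton (assumed design, see lead-in)."""
    rng = random.Random(seed)
    r = len(reward_probs)
    probs = [1.0 / r] * r   # action-choice probabilities
    counts = [0] * r        # how many times each action was selected
    rewards = [0] * r       # how many times each action was rewarded

    for _ in range(steps):
        # Select an action according to the current choice probabilities.
        action = rng.choices(range(r), weights=probs)[0]

        # Simulated stationary environment: reward with a fixed probability.
        reward = 1 if rng.random() < reward_probs[action] else 0
        counts[action] += 1
        rewards[action] += reward

        # Stochastic estimates: empirical mean plus noise whose spread is
        # larger for actions that have been tried only a few times
        # (assumption: sigma = noise_scale / (1 + count)).
        estimates = []
        for i in range(r):
            mean = rewards[i] / counts[i] if counts[i] else 0.5
            sigma = noise_scale / (1 + counts[i])
            estimates.append(mean + rng.gauss(0.0, sigma))

        # Pursuit-style update (assumption): shrink all probabilities and
        # give the freed mass to the action currently estimated as optimal.
        best = max(range(r), key=estimates.__getitem__)
        probs = [(1.0 - rate) * p for p in probs]
        probs[best] += rate

    return probs, counts


if __name__ == "__main__":
    final_probs, selection_counts = run_automaton([0.2, 0.5, 0.8])
    print(final_probs, selection_counts)
```

Because the noise shrinks as an action accumulates selections, the estimates in this sketch approach the plain empirical means, which is one possible way for a little-tried action to be estimated as optimal early on without disturbing the ranking later.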