A nonlinear reinforcement scheme for stochastic learning automata

  • Authors:
  • Florin Stoica; Emil M. Popa

  • Affiliations:
  • Computer Science Department, University "Lucian Blaga" Sibiu, Sibiu, Romania; Computer Science Department, University "Lucian Blaga" Sibiu, Sibiu, Romania

  • Venue:
  • MMACTEE'06 Proceedings of the 8th WSEAS international conference on Mathematical methods and computational techniques in electrical engineering
  • Year:
  • 2006

Abstract

A stochastic automaton can perform a finite number of actions in a random environment. When a specific action is performed, the environment responds by producing an output that is stochastically related to that action. This response may be favorable or unfavorable. The aim is to design an automaton that can determine the best action, guided by past actions and responses. The reinforcement scheme presented is shown to satisfy all necessary and sufficient conditions for absolute expediency in a stationary environment. An automaton using this scheme is guaranteed to "do better" at every time step than at the previous one: the expected value of the average penalty at each iteration step is less than that at the previous step, for all steps. Some simulation results are presented, showing that our algorithm converges to a solution faster than the one given in [7].
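The interaction loop described above — choose an action, receive a stochastic favorable/unfavorable response, and update the action probabilities — can be sketched as follows. The paper's specific nonlinear scheme is not reproduced here; as a hedged stand-in, this sketch uses the classic linear reward-inaction (L_RI) update, and the environment, penalty probabilities, and learning rate are illustrative assumptions.

```python
import random

def run_automaton(penalty_probs, steps=20000, a=0.01, seed=1):
    """Simulate a learning automaton in a stationary random environment.

    penalty_probs[i] is the (assumed) probability that the environment
    penalizes action i.  Uses a linear reward-inaction (L_RI) update as an
    illustrative stand-in for the paper's nonlinear scheme.
    """
    rng = random.Random(seed)
    n = len(penalty_probs)
    p = [1.0 / n] * n  # action probabilities, initially uniform

    for _ in range(steps):
        # sample an action according to the current probability vector
        r, chosen, acc = rng.random(), 0, 0.0
        for i, pi in enumerate(p):
            acc += pi
            if r < acc:
                chosen = i
                break

        penalized = rng.random() < penalty_probs[chosen]
        if not penalized:
            # favorable response: shift probability mass toward the action
            p = [(1 - a) * pj for pj in p]
            p[chosen] += a
        # unfavorable response: inaction (probabilities unchanged)

    return p

probs = run_automaton([0.8, 0.2, 0.6])
best = probs.index(max(probs))  # action with the lowest penalty probability
```

With enough steps and a small learning rate, the probability vector concentrates on the action with the lowest penalty probability (index 1 in this example), which is the "best action" the abstract refers to.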