Learning the IPA market with individual and social rewards
Web Intelligence and Agent Systems
Hi-index | 0.00 |
A dynamic pricing problem is solved by using asymmetric multiagent reinforcement learning in this article. In the problem, there are two competing brokers that sell identical products to customers and compete on the basis of price. We model this dynamic pricing problem as a Markov game and solve it by using two different learning methods. The first method utilizes modified gradient descent in the parameter space of the value function approximator and the second method uses a direct gradient of the parameterized policy function. We present a brief literature survey of pricing models based on multiagent reinforcement learning, introduce the basic concepts of Markov games, and solve the problem by using proposed methods. © 2006 Wiley Periodicals, Inc. Int J Int Syst 21: 73–98, 2006.