Learning the IPA Market with Individual and Social Rewards

  • Authors:
  • Eduardo Rodrigues Gomes; Ryszard Kowalczyk


  • Venue:
  • IAT '07 Proceedings of the 2007 IEEE/WIC/ACM International Conference on Intelligent Agent Technology
  • Year:
  • 2007


Abstract

Market-based mechanisms offer a promising approach for the distributed allocation of resources without centralized control. One such mechanism is the Iterative Price Adjustment (IPA). Under standard assumptions, the IPA uses demand functions that do not allow the agents to have preferences over attributes of the allocation, e.g. different price or resource levels. One alternative for addressing this limitation is to describe the agents' preferences using utility functions. In such a scenario, however, there is no unique mapping between a utility function and a demand function. Gomes & Kowalczyk [10, 9] proposed the use of Reinforcement Learning to let the agents learn demand functions from their utility functions. Their approach rewards agents based on their individual utilities at the end of the allocation. In this paper, we extend that work by applying a new reward function based on the social welfare of the allocation, and by considering more clients in the market. The learning process and the behavior of the agents under both reward functions are investigated through experiments, and the results are compared.
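To make the setup concrete, the following is a minimal sketch of the idea described in the abstract: Q-learning agents in a toy IPA-style market choose demand levels in response to a posted price, which is adjusted iteratively toward supply/demand balance, and each agent is rewarded either with its individual utility or with the social welfare (the sum of all utilities). All names, parameters, and the utility form are illustrative assumptions, not details taken from the paper.

```python
import random

PRICES = [1, 2, 3, 4, 5]   # discretised price levels (the agent's state)
DEMANDS = [0, 1, 2, 3]     # discretised demand levels (the agent's actions)
SUPPLY = 4                 # total resource available per allocation round


def utility(demand, price):
    # Assumed concave utility: value of the resource minus the cost paid.
    return 2.0 * (demand ** 0.5) - 0.1 * price * demand


class QAgent:
    """Bandit-style Q-learner over (price, demand) pairs."""

    def __init__(self, eps=0.1, alpha=0.2):
        self.q = {(p, d): 0.0 for p in PRICES for d in DEMANDS}
        self.eps, self.alpha = eps, alpha

    def act(self, price):
        # Epsilon-greedy choice of a demand level for the current price.
        if random.random() < self.eps:
            return random.choice(DEMANDS)
        return max(DEMANDS, key=lambda d: self.q[(price, d)])

    def learn(self, price, demand, reward):
        # One-step update toward the observed reward.
        self.q[(price, demand)] += self.alpha * (reward - self.q[(price, demand)])


def run_market(agents, rounds=2000, social=False):
    """Run IPA rounds; social=True rewards each agent with total welfare."""
    random.seed(0)
    price = PRICES[len(PRICES) // 2]
    for _ in range(rounds):
        demands = [a.act(price) for a in agents]
        utils = [utility(d, price) for d in demands]
        welfare = sum(utils)
        for agent, d, u in zip(agents, demands, utils):
            agent.learn(price, d, welfare if social else u)
        # IPA-style adjustment: raise the price when demand exceeds supply,
        # lower it when supply exceeds demand.
        excess = sum(demands) - SUPPLY
        idx = PRICES.index(price)
        if excess > 0 and idx < len(PRICES) - 1:
            price = PRICES[idx + 1]
        elif excess < 0 and idx > 0:
            price = PRICES[idx - 1]
    return price, [a.act(price) for a in agents]
```

Swapping `social=False` for `social=True` changes only the reward signal, which is exactly the comparison the paper investigates: individually rewarded agents optimise their own utility, while socially rewarded agents receive a shared signal tied to the welfare of the whole allocation.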