Pricing in Agent Economies Using Multi-Agent Q-Learning

Authors:
Gerald Tesauro; Jeffrey O. Kephart
Affiliations:
IBM Institute for Advanced Commerce, IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA, Tesauro@watson.ibm.com; IBM Institute for Advanced Commerce, IBM Thomas J. Watson Research Center, Yorktown Heights, NY 10598, USA, kephart@us.ibm.com
Venue:
Autonomous Agents and Multi-Agent Systems
Year:
2002

Abstract

This paper investigates how adaptive software agents may utilize reinforcement learning algorithms such as Q-learning to make economic decisions such as setting prices in a competitive marketplace. For a single adaptive agent facing fixed-strategy opponents, ordinary Q-learning is guaranteed to find the optimal policy. However, for a population of agents each trying to adapt in the presence of other adaptive agents, the problem becomes non-stationary and history dependent, and it is not known whether any global convergence will be obtained, and if so, whether such solutions will be optimal. In this paper, we study simultaneous Q-learning by two competing seller agents in three moderately realistic economic models. This is the simplest case in which interesting multi-agent phenomena can occur, and the state space is small enough so that lookup tables can be used to represent the Q-functions. We find that, despite the lack of theoretical guarantees, simultaneous convergence to self-consistent optimal solutions is obtained in each model, at least for small values of the discount parameter. In some cases, exact or approximate convergence is also found even at large discount parameters. We show how the Q-derived policies increase profitability and damp out or eliminate cyclic price “wars” compared to simpler policies based on zero lookahead or short-term lookahead. In one of the models (the “Shopbot” model) where the sellers' profit functions are symmetric, we find that Q-learning can produce either symmetric or broken-symmetry policies, depending on the discount parameter and on initial conditions.
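
The single-learner case mentioned in the abstract (one adaptive seller with a lookup-table Q-function facing a fixed-strategy opponent) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the price grid `PRICES`, the unit cost `COST`, the winner-take-all `profit` function, and the myopic opponent are hypothetical stand-ins, not any of the paper's three economic models. To keep the result reproducible, the sketch applies the standard Q-learning update in deterministic sweeps over every state-action pair rather than along sampled trajectories.

```python
PRICES = [0.2, 0.4, 0.6, 0.8, 1.0]  # hypothetical discrete price grid
COST = 0.1                          # hypothetical unit cost


def profit(my_price, other_price):
    """Toy winner-take-all demand (an assumption): the cheaper seller
    serves a single buyer, and equal prices split the sale."""
    if my_price < other_price:
        return my_price - COST
    if my_price == other_price:
        return 0.5 * (my_price - COST)
    return 0.0


def myopic_reply(other_idx):
    """Fixed-strategy opponent: index of the price that maximizes
    its immediate profit against the learner's price."""
    return max(range(len(PRICES)),
               key=lambda i: profit(PRICES[i], PRICES[other_idx]))


def train(gamma, alpha=0.5, sweeps=500):
    """Return the lookup table q[state][action], where the state is the
    opponent's current price index and the action is the learner's."""
    n = len(PRICES)
    q = [[0.0] * n for _ in range(n)]
    for _ in range(sweeps):
        for state in range(n):
            for action in range(n):
                reward = profit(PRICES[action], PRICES[state])
                next_state = myopic_reply(action)  # opponent responds
                # Standard Q-learning update with discount parameter gamma.
                target = reward + gamma * max(q[next_state])
                q[state][action] += alpha * (target - q[state][action])
    return q


def greedy_policy(q):
    """Greedy price index for each opponent price index."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]
```

With `gamma=0.0` the learned greedy policy in this toy setting coincides with the zero-lookahead (myopic) response. Calling `greedy_policy(train(gamma))` for larger `gamma` gives a way to inspect how lookahead changes the learner's pricing in the toy model. The simultaneous two-learner setting studied in the paper would require each seller to maintain its own table while the other adapts, which this sketch does not attempt.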