Can agents acquire human-like behaviors in a sequential bargaining game?: comparison of Roth's and Q-learning agents

  • Authors:
  • Keiki Takadama;Tetsuro Kawai;Yuhsuke Koyama

  • Affiliations:
  • The University of Electro-Communications, Chofu, Tokyo, Japan;Sony Corporation, Tokyo, Japan;Tokyo Institute of Technology, Yokohama, Japan

  • Venue:
  • MABS'06 Proceedings of the 2006 international conference on Multi-agent-based simulation VII
  • Year:
  • 2006

Abstract

This paper addresses agent modeling in multiagent-based simulation (MABS), exploring agents that can reproduce human-like behaviors in the sequential bargaining game, which is harder to reproduce than the ultimatum game (i.e., the one-shot bargaining game). For this purpose, we focus on Roth's learning agents, which can reproduce human-like behaviors in several simple settings including the ultimatum game, and compare simulation results of Roth's learning agents and Q-learning agents in the sequential bargaining game. Intensive simulations have revealed the following implications: (1) Roth's basic and three-parameter reinforcement learning agents, with any of three action-selection methods (i.e., ε-greedy, roulette, and Boltzmann distribution selection), can neither learn consistent behaviors nor acquire sequential negotiation in the sequential bargaining game; and (2) Q-learning agents with any of the same three action-selection methods, on the other hand, can both learn consistent behaviors and acquire sequential negotiation in the same game. However, Q-learning agents cannot reproduce the decreasing trend found in human subject experiments.
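The three action-selection methods named in the abstract are standard in the reinforcement-learning literature. As a rough illustration (not the paper's implementation; function names, parameters, and defaults here are assumptions), each maps a table of Q-values to a chosen action index:

```python
import math
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """With probability epsilon pick a uniform random action, else the greedy one."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def roulette(q_values):
    """Roulette (proportional) selection: P(a) proportional to Q(a); assumes non-negative Q-values."""
    total = sum(q_values)
    r = random.uniform(0.0, total)
    acc = 0.0
    for a, q in enumerate(q_values):
        acc += q
        if r <= acc:
            return a
    return len(q_values) - 1  # guard against floating-point rounding

def boltzmann(q_values, temperature=1.0):
    """Boltzmann (softmax) selection: P(a) proportional to exp(Q(a) / T)."""
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    r = random.uniform(0.0, total)
    acc = 0.0
    for a, w in enumerate(weights):
        acc += w
        if r <= acc:
            return a
    return len(weights) - 1
```

The temperature and epsilon values used in the paper's simulations are not given in the abstract; lower temperatures and smaller epsilons both shift selection toward exploitation of the current Q-value estimates.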