Two-sided bandits and the dating market

Authors:
Sanmay Das; Emir Kamenica
Affiliations:
Center for Biological and Computational Learning and Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, Cambridge, MA; Department of Economics, Harvard University, Cambridge, MA
Venue:
IJCAI'05: Proceedings of the 19th International Joint Conference on Artificial Intelligence
Year:
2005

Abstract

We study the decision problems facing agents in repeated matching environments with learning, or two-sided bandit problems, and examine the dating market, in which men and women repeatedly go out on dates and learn about each other, as an example. We consider three natural matching mechanisms and empirically examine properties of these mechanisms, focusing on the asymptotic stability of the resulting matchings when the agents use a simple learning rule coupled with an ε-greedy exploration policy. Matchings tend to be more stable when agents are patient in either of two ways: if they are more likely to explore early or if they are more optimistic. However, the two forms of patience do not interact well in terms of increasing the probability of stable outcomes. We also define a notion of regret for the two-sided problem and study the distribution of regrets under the different matching mechanisms.
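
The abstract names two ingredients that a small sketch can make concrete: an ε-greedy choice over estimated partner values, and a test of whether a matching is stable (has no blocking pair). The code below is only an illustration under stated assumptions, not the authors' implementation. The abstract does not specify the learning rule or the three matching mechanisms, so a running sample mean stands in for the "simple learning rule", and all function names, the value matrices, and the equal-sized one-to-one matching are hypothetical.

```python
import random


def epsilon_greedy_choice(estimates, epsilon, rng):
    """Pick a partner index: explore uniformly with probability epsilon,
    otherwise pick the partner with the highest estimated value."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))
    best = max(estimates)
    # Break ties uniformly among the best-looking partners.
    return rng.choice([i for i, v in enumerate(estimates) if v == best])


def update_sample_mean(estimates, counts, partner, reward):
    """Assumed learning rule: incremental running mean of observed rewards."""
    counts[partner] += 1
    estimates[partner] += (reward - estimates[partner]) / counts[partner]


def blocking_pairs(matching, man_values, woman_values):
    """Return all (m, w) who strictly prefer each other to their partners.

    matching[m] is the woman matched to man m. Assumes equal numbers of
    men and women and a complete one-to-one matching.
    man_values[m][w] and woman_values[w][m] are hypothetical true values.
    """
    partner_of_woman = {w: m for m, w in enumerate(matching)}
    pairs = []
    for m, current_w in enumerate(matching):
        for w in range(len(woman_values)):
            if w == current_w:
                continue
            man_prefers = man_values[m][w] > man_values[m][current_w]
            woman_prefers = (
                woman_values[w][m] > woman_values[w][partner_of_woman[w]]
            )
            if man_prefers and woman_prefers:
                pairs.append((m, w))
    return pairs


def is_stable(matching, man_values, woman_values):
    """A matching is stable when it admits no blocking pair."""
    return not blocking_pairs(matching, man_values, woman_values)
```

In this sketch, optimism could be modeled by starting `estimates` at high initial values, and early exploration by letting `epsilon` decay over time. Both are assumptions about how the abstract's two forms of patience might be parameterized.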