An incentive-compatible multi-armed bandit mechanism

Authors:
Rica Gonen;Elan Pavlov
Affiliations:
Yahoo! Research, Sunnyvale, CA;MIT Media Lab, Cambridge, MA
Venue:
Proceedings of the twenty-sixth annual ACM symposium on Principles of distributed computing
Year:
2007

Citing 4
Cited 10

An approximate truthful mechanism for combinatorial auctions with single parameter agents

SODA '03 Proceedings of the fourteenth annual ACM-SIAM symposium on Discrete algorithms
PAC Bounds for Multi-armed Bandit and Markov Decision Processes

COLT '02 Proceedings of the 15th Annual Conference on Computational Learning Theory
Anytime algorithms for multi-armed bandit problems

SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm
Truthful auctions for pricing search keywords

EC '06 Proceedings of the 7th ACM conference on Electronic commerce

Dynamic cost-per-action mechanisms and applications to online advertising

Proceedings of the 17th international conference on World Wide Web
On the Hardness of Truthful Online Auctions with Multidimensional Constraints

CiE '08 Proceedings of the 4th conference on Computability in Europe: Logic and Theory of Algorithms
Sponsored Search Auctions with Reserve Prices: Going Beyond Separability

WINE '08 Proceedings of the 4th International Workshop on Internet and Network Economics
Characterizing truthful multi-armed bandit mechanisms: extended abstract

Proceedings of the 10th ACM conference on Electronic commerce
The price of truthfulness for pay-per-click auctions

Proceedings of the 10th ACM conference on Electronic commerce
Maintaining equilibria during exploration in sponsored search auctions

WINE'07 Proceedings of the 3rd international conference on Internet and network economics
An adaptive sponsored search mechanism δ-gain truthful in valuation, time, and budget

WINE'07 Proceedings of the 3rd international conference on Internet and network economics
A truthful learning mechanism for contextual multi-slot sponsored search auctions with externalities

Proceedings of the 13th ACM Conference on Electronic Commerce
Learning and incentives in user-generated content: multi-armed bandits with endogenous arms

Proceedings of the 4th conference on Innovations in Theoretical Computer Science
Online learning for auction mechanism in bandit setting

Decision Support Systems

Abstract

This paper presents a truthful sponsored search auction based on an incentive-compatible multi-armed bandit mechanism. The mechanism combines several desirable traits: it gives advertisers an incentive to report their true bids, learns the click-through rates of advertisements, allows for slots of different quality, and loses the minimum welfare during the sampling process. The underlying generalization of the multi-armed bandit mechanism addresses the interplay between exploration and exploitation in an online setting, and it is truthful with high probability while allowing for slots of different quality. As the mechanism progresses, the algorithm approximates the hidden variables (click-through rates) more closely in order to allocate advertising slots to the best advertisements. The resulting mechanism obtains the optimal welfare apart from a tightly bounded loss of welfare caused by the bandit sampling process. Of independent interest, it has long been recognized in economics that preference elicitation is difficult to achieve, mainly because people are unaware of how much happiness a particular good will bring them. This paper alleviates the problem somewhat by introducing a valuation-discovery process into the mechanism, which results in a preference-elicitation mechanism for advertisers and search engines.
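The exploration and exploitation interplay described in the abstract can be illustrated with a minimal sketch. This is not the authors' mechanism: it is a generic explore-then-exploit allocation, it omits the payment rule that truthfulness depends on, and it assumes a separable click model in which the click probability is slot quality times the ad's click-through rate. All function and parameter names (`estimate_ctrs`, `allocate_slots`, `expected_welfare`, `rounds_per_ad`) are hypothetical.

```python
import random


def estimate_ctrs(true_ctrs, rounds_per_ad, rng):
    """Exploration: show each ad rounds_per_ad times and record its click frequency.

    true_ctrs is hidden from the mechanism; it is used here only to simulate clicks.
    """
    estimates = []
    for ctr in true_ctrs:
        clicks = sum(1 for _ in range(rounds_per_ad) if rng.random() < ctr)
        estimates.append(clicks / rounds_per_ad)
    return estimates


def allocate_slots(bids, ctr_estimates, slot_qualities):
    """Exploitation: rank ads by bid * estimated CTR; better slots go to higher-ranked ads.

    Returns a dict mapping slot index to ad index.
    """
    ranked_ads = sorted(
        range(len(bids)),
        key=lambda i: bids[i] * ctr_estimates[i],
        reverse=True,
    )
    ranked_slots = sorted(
        range(len(slot_qualities)),
        key=lambda s: slot_qualities[s],
        reverse=True,
    )
    return dict(zip(ranked_slots, ranked_ads))


def expected_welfare(allocation, bids, true_ctrs, slot_qualities):
    """Expected welfare under the assumed separable click model."""
    return sum(
        slot_qualities[slot] * true_ctrs[ad] * bids[ad]
        for slot, ad in allocation.items()
    )


if __name__ == "__main__":
    rng = random.Random(0)           # fixed seed so the simulation is repeatable
    bids = [1.0, 2.0, 0.5]           # reported values per click (illustrative)
    true_ctrs = [0.30, 0.10, 0.50]   # hidden click-through rates (illustrative)
    slot_qualities = [1.0, 0.6]      # two slots of different quality (illustrative)

    estimates = estimate_ctrs(true_ctrs, rounds_per_ad=500, rng=rng)
    allocation = allocate_slots(bids, estimates, slot_qualities)
    print(allocation, expected_welfare(allocation, bids, true_ctrs, slot_qualities))
```

In this sketch, the welfare lost to sampling grows with `rounds_per_ad`, while the accuracy of the estimated click-through rates improves with it; the paper's contribution is a mechanism that bounds that loss while remaining truthful with high probability, which this simplified example does not attempt to reproduce.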