An incentive-compatible multi-armed bandit mechanism

  • Authors:
  • Rica Gonen;Elan Pavlov

  • Affiliations:
  • Yahoo! Research, Sunnyvale, CA;MIT Media lab, Cambridge, MA

  • Venue:
  • Proceedings of the twenty-sixth annual ACM symposium on Principles of distributed computing
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper presents a truthful sponsored search auction based on an incentive-compatible multi-armed bandit mechanism. The mechanism described combines several desirable traits. The mechanism gives advertisers the incentive to report their true bid, learns the click-through rate for advertisements, allows for slots with different quality, and loses the minimum welfare during the sampling process. The underlying generalization of the multi-armed bandit mechanism addresses the interplay between exploration and exploitation in an online setting that is truthful in high probability while allowing for slots of different quality. As the mechanism progresses the algorithm more closely approximates the hidden variables (click-though rates) in order to allocate advertising slots to the best advertisements. The resulting mechanism obtains the optimal welfare apart from a tightly bounded loss of welfare caused by the bandit sampling process. Of independent interest, in the field of economics it has long been recognized that preference elicitation is difficult to achieve, mainly as people are unaware of how much happiness a particular good will bring to them. In this paper we alleviate this problem somewhat by introducing a valuation-discovery process to the mechanism which results in a preference-elicitation mechanism for advertisers and search engines.