We analyze the problem of designing a truthful pay-per-click auction in which the click-through rates (CTRs) of the bidders are unknown to the auction. Such an auction faces the classic explore/exploit dilemma: while gathering information about the click-through rates of advertisers, the mechanism may lose revenue; however, the information gained may prove valuable in the future for a more profitable allocation. In this sense, such mechanisms are prime candidates to be designed using multi-armed bandit techniques. However, a naive application of multi-armed bandit algorithms would not take into account the strategic considerations of the players: players might manipulate their bids (which determine the auction's revenue) so as to maximize their own utility. Hence, we consider the natural restriction that the auction be truthful. The revenue that we could hope to achieve is the expected revenue of a Vickrey auction that knows the true CTRs, and we define the truthful regret to be the difference between the expected revenue of the auction and this Vickrey revenue. This work sharply characterizes what regret is achievable under the truthfulness restriction. We show that this restriction imposes statistical limits on the achievable regret: the achievable regret is Θ(T^{2/3}), while for traditional bandit algorithms (without the truthfulness restriction) the achievable regret is Θ(T^{1/2}), where T is the number of rounds. We term the extra T^{1/6} factor the "price of truthfulness".
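To make the regret definition concrete, the following is a minimal Python sketch, not the construction analyzed in this work. It assumes a single ad slot, fixed bids, and a hypothetical explore-then-commit mechanism: a bid-independent round-robin exploration phase in which clicks are free, followed by a second-price auction on the estimated CTRs. The function names, the round-robin schedule, and the choice of roughly T^{2/3} exploration rounds are all illustrative assumptions.

```python
import random


def vickrey_benchmark(bids, ctrs):
    """Expected per-round revenue of a second-price auction that knows the true CTRs.

    Bidders are ranked by bid * CTR; the winner's expected payment per
    impression equals the second-highest bid * CTR.
    """
    scores = sorted((b * c for b, c in zip(bids, ctrs)), reverse=True)
    return scores[1]


def explore_then_commit_revenue(bids, ctrs, T, n_explore, rng):
    """Expected revenue of a hypothetical explore-then-commit mechanism.

    Assumption: the first n_explore rounds show ads round-robin, ignore the
    bids, and charge nothing. The remaining rounds run a second-price
    auction using the CTR estimates gathered during exploration.
    """
    k = len(bids)
    shows = [0] * k
    clicks = [0] * k
    for t in range(n_explore):
        i = t % k  # exploration schedule does not depend on bids
        shows[i] += 1
        clicks[i] += rng.random() < ctrs[i]  # simulated click
    est = [clicks[i] / max(shows[i], 1) for i in range(k)]

    order = sorted(range(k), key=lambda i: bids[i] * est[i], reverse=True)
    winner, runner_up = order[0], order[1]
    if est[winner] > 0:
        # Per-click price that makes the winner pay the runner-up's score.
        price = bids[runner_up] * est[runner_up] / est[winner]
    else:
        price = 0.0
    # Expected revenue uses the winner's true CTR, not the estimate.
    return (T - n_explore) * ctrs[winner] * price


def truthful_regret(bids, ctrs, T, n_explore, rng):
    """Benchmark Vickrey revenue over T rounds minus the mechanism's expected revenue."""
    benchmark = T * vickrey_benchmark(bids, ctrs)
    return benchmark - explore_then_commit_revenue(bids, ctrs, T, n_explore, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    T = 10_000
    n_explore = round(T ** (2 / 3))  # assumed exploration length
    bids = [1.0, 2.0, 0.5]
    ctrs = [0.5, 0.2, 0.4]
    print(truthful_regret(bids, ctrs, T, n_explore, rng))
```

In this sketch the exploration rounds earn nothing, so exploring for about T^{2/3} rounds already costs revenue of that order, which is consistent in spirit with the Θ(T^{2/3}) rate stated above. The gap to the Θ(T^{1/2}) rate of unrestricted bandit algorithms is the ratio T^{2/3} / T^{1/2} = T^{1/6}.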