Online digital goods auctions are settings where a seller with an unlimited supply of goods (e.g. music or movie downloads) interacts with a stream of potential buyers. In the posted price setting, the seller makes a take-it-or-leave-it offer to each arriving buyer. We study the seller's revenue maximization problem in posted-price auctions of digital goods. We find that algorithms from the multi-armed bandit literature like UCB, which come with good regret bounds, can be slow to converge. We propose and study two alternatives: (1) a scheme based on using Gittins indices with priors that make appropriate use of domain knowledge; (2) a new learning algorithm, LLVD, that assumes a linear demand curve, and maintains a Beta prior over the free parameter using a moment-matching approximation. LLVD is not only (approximately) optimal for linear demand, but also learns fast and performs well when the linearity assumption is violated, for example in the cases of two natural valuation distributions, exponential and log-normal.
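To make the posted-price setting concrete, the following is a minimal sketch of the standard UCB1 rule run over a fixed grid of candidate prices. It is not the authors' LLVD algorithm or their Gittins-index scheme. The function name `ucb1_posted_price`, the price grid, and the simulated buyers with valuations uniform on [0, 1] (which gives the linear demand curve D(p) = 1 - p) are all assumptions made for illustration.

```python
import math
import random


def ucb1_posted_price(prices, valuations):
    """Offer one price from `prices` to each buyer in `valuations` using UCB1.

    Hypothetical helper for illustration. Prices are assumed to lie in [0, 1]
    so that per-round revenue is a reward in [0, 1].
    Returns (total_revenue, counts), where counts[i] is how often prices[i]
    was offered.
    """
    k = len(prices)
    counts = [0] * k    # number of times each price was offered
    sums = [0.0] * k    # revenue collected at each price
    total = 0.0
    for t, v in enumerate(valuations, start=1):
        if t <= k:
            arm = t - 1  # offer every price once before using the index
        else:
            # UCB1 index: empirical mean revenue plus an exploration bonus
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        p = prices[arm]
        # Take-it-or-leave-it offer: the buyer accepts iff valuation >= price
        reward = p if v >= p else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return total, counts


if __name__ == "__main__":
    rng = random.Random(0)                        # fixed seed for repeatability
    grid = [0.1 * i for i in range(1, 10)]        # assumed price grid 0.1 .. 0.9
    buyers = [rng.random() for _ in range(5000)]  # uniform valuations (assumed)
    revenue, counts = ucb1_posted_price(grid, buyers)
    print("total revenue:", round(revenue, 2))
    print("offers per price:", counts)
```

Under the assumed uniform valuations, expected revenue at price p is p(1 - p), which peaks at p = 0.5. Watching how many rounds the offer counts take to concentrate near that price gives a rough feel for the convergence speed the abstract refers to.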