Learning dynamic prices in multiseller electronic retail markets with price sensitive customers, stochastic demands, and inventory replenishments

  • Authors:
  • V. L. R. Chinthalapati; N. Yadati; R. Karumanchi

  • Affiliations:
  • Dept. of Math., London Sch. of Econ. & Political Sci.; -; -

  • Venue:
  • IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews
  • Year:
  • 2006


Abstract

In this paper, we use reinforcement learning (RL) as a tool to study price dynamics in an electronic retail market consisting of two competing sellers and price-sensitive, lead-time-sensitive customers. The sellers, offering identical products, compete on price to satisfy stochastically arriving demands (customers), and follow standard inventory control and replenishment policies to manage their inventories. RL techniques have not previously been applied in such a generalized setting. We consider two representative cases: 1) the no-information case, where neither seller has any information about the competitor's customer queue levels, inventory levels, or prices; and 2) the partial-information case, where each seller has information about the competitor's customer queue levels and inventory levels. Sellers employ automated pricing agents, or pricebots, which use RL-based pricing algorithms to reset prices at random intervals based on factors such as the number of back orders, inventory levels, and replenishment lead times, with the objective of maximizing discounted cumulative profit. In the no-information case, we show that a seller who uses Q-learning outperforms a seller who uses derivative following (DF). In the partial-information case, we model the problem as a Markovian game and use actor-critic-based RL to learn dynamic prices. We believe our approach is a new and promising way of setting dynamic prices in multiseller environments with stochastic demands, price-sensitive customers, and inventory replenishments.
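As a rough illustration of the no-information case, the sketch below shows a minimal tabular Q-learning pricebot in Python. The price grid, hyperparameters, state encoding, and the `simulator` interface are hypothetical stand-ins; the abstract does not specify the authors' actual state space, reward model, or learning parameters.

```python
import random
from collections import defaultdict

# Illustrative assumptions, not values from the paper:
PRICES = [8.0, 9.0, 10.0, 11.0, 12.0]   # discrete candidate price points
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1  # learning rate, discount, exploration

# Q[(state, action)] -> estimated discounted cumulative profit.
# A state could encode, e.g., (inventory level, number of back orders),
# both discretized so the table stays small.
Q = defaultdict(float)

def choose_price(state):
    """Epsilon-greedy selection of a price index from the grid."""
    if random.random() < EPSILON:
        return random.randrange(len(PRICES))
    return max(range(len(PRICES)), key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """One-step Q-learning update toward reward + discounted best next value."""
    best_next = max(Q[(next_state, a)] for a in range(len(PRICES)))
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

def run_episode(simulator, steps=1000):
    """Interaction loop against an assumed market simulator exposing
    reset() -> state and step(price) -> (profit, next_state)."""
    state = simulator.reset()
    for _ in range(steps):
        action = choose_price(state)
        reward, next_state = simulator.step(PRICES[action])
        update(state, action, reward, next_state)
        state = next_state
```

For comparison, a derivative-following (DF) pricebot would ignore state altogether: it keeps moving its price in the same direction while profit improves and reverses direction when it does not, which is why a state-aware Q-learner can outperform it.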