Traffic quality based pricing in paid search using two-stage regression

  • Authors:
  • Rouben Amirbekian;Ye Chen;Alan Lu;Tak W. Yan;Liangzhong Yin

  • Affiliations:
  • Microsoft Corporation, Sunnyvale, CA, USA;Microsoft Corporation, Sunnyvale, CA, USA;Microsoft Corporation, Sunnyvale, CA, USA;Microsoft Corporation, Sunnyvale, CA, USA;Microsoft Corporation, Sunnyvale, CA, USA

  • Venue:
  • Proceedings of the 22nd international conference on World Wide Web companion
  • Year:
  • 2013

Abstract

While the cost-per-click (CPC) pricing model is mainstream in sponsored search, the quality of clicks with respect to conversion rates, and hence their value to advertisers, may vary considerably from publisher to publisher in a large syndication network. Traffic quality should be used to establish price discounts for clicks from different publishers. These discounts are intended to maintain incentives for high-quality online traffic and to make it easier for advertisers to maintain long-term bid stability. The conversion signal is noisy, because each advertiser defines conversion in its own way, and it is also very sparse. The traditional way of overcoming signal sparseness is to accumulate modeling data over a longer time. However, because conversion trends change quickly, a longer window degrades the precision with which quality is measured. To allow models to adjust to fast-changing trends with sufficient speed, we had to limit the time window for conversion data collection and make it much shorter than the commonly used window of several weeks. Such a short window makes conversions in the training set extremely sparse. To overcome the resulting obstacles, we used a two-stage regression similar to hurdle regression. First, we employed logistic regression to predict zero conversion outcomes. Next, conditioned on non-zero outcomes, we used random forest regression to predict the value of the quotient of two conversion rates. The two-stage model accounts for the zero inflation caused by the sparseness of the conversion signal. The combined model maintains good precision and allows faster reaction to temporal changes in traffic quality, including changes due to certain actions by publishers that may lead to click-price inflation.
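
The two-stage (hurdle-style) idea described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. Several things here are assumptions made for illustration: the single feature `x` (a hypothetical publisher quality signal), the synthetic data, all function names, and the choice to combine the stages as P(non-zero) times the conditional value, which is the usual hurdle-model form but is not stated in the abstract. To keep the example dependency-free, the second stage uses a simple least-squares line as a stand-in for the random forest regression named in the abstract.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def fit_logistic(xs, zs, lr=1.0, steps=2000):
    """Stage 1: fit P(z = 1 | x) = sigmoid(w * x + b) by gradient ascent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        gw = gb = 0.0
        for x, z in zip(xs, zs):
            err = z - sigmoid(w * x + b)  # gradient of the log-likelihood
            gw += err * x
            gb += err
        w += lr * gw / n
        b += lr * gb / n
    return w, b

def fit_line(xs, ys):
    """Stage 2 stand-in: ordinary least-squares line (NOT a random forest)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def fit_two_stage(xs, ys):
    # Stage 1 is trained on all rows with a zero / non-zero label.
    zs = [1.0 if y > 0 else 0.0 for y in ys]
    w, b = fit_logistic(xs, zs)
    # Stage 2 is trained only on rows with a non-zero outcome.
    pos = [(x, y) for x, y in zip(xs, ys) if y > 0]
    slope, intercept = fit_line([x for x, _ in pos], [y for _, y in pos])
    return {"w": w, "b": b, "slope": slope, "intercept": intercept}

def prob_nonzero(model, x):
    return sigmoid(model["w"] * x + model["b"])

def conditional_value(model, x):
    return model["slope"] * x + model["intercept"]

def predict(model, x):
    # Assumed combination rule: P(y > 0 | x) * E[y | y > 0, x].
    return prob_nonzero(model, x) * conditional_value(model, x)

# Synthetic, deterministic, zero-inflated data (purely illustrative).
# y plays the role of a conversion-rate quotient; it is 0 for many rows,
# and the chance of a non-zero outcome grows with x.
xs = [i / 199 for i in range(200)]
ys = [0.5 + x if (37 * i % 100) / 100 < x else 0.0 for i, x in enumerate(xs)]

model = fit_two_stage(xs, ys)
low, high = predict(model, 0.1), predict(model, 0.9)
print(f"predicted quotient at x=0.1: {low:.3f}, at x=0.9: {high:.3f}")
```

Separating the two stages means the many zero outcomes do not drag down the regression on the non-zero values; the first stage absorbs the zero inflation, and the second stage only has to model the magnitude when a conversion signal is present.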