Data-driven text features for sponsored search click prediction

  • Authors:
  • Benyah Shaparenko;Özgür Çetin;Rukmini Iyer

  • Affiliations:
  • Cornell University, Ithaca, NY;Yahoo! Labs, New York, NY;Yahoo! Labs, Santa Clara, CA

  • Venue:
  • Proceedings of the Third International Workshop on Data Mining and Audience Intelligence for Advertising
  • Year:
  • 2009

Abstract

Much search engine revenue comes from sponsored search ads displayed alongside algorithmic search results. To maximize revenue, it is essential to choose a good slate of ads for each query, which requires accurate prediction of whether users will click on an ad. Click prediction is relatively easy for ads that have been displayed many times and have significant click history, but in the long tail, where click history is minimal or absent, other features are needed to predict user response. In this work, we investigate the use of novel text features for this problem within the context of a state-of-the-art sponsored search system. In particular, we propose the use of detailed word-pair indicator features between the query and the ad. We compare the new features to traditional vector-space and language-modeling features extracted in a typical information-retrieval style. We evaluate these approaches in a maximum-entropy ranking model using click-view data from commercial search-engine traffic. We show that the word-pair features are highly helpful for sponsored search click prediction, not only improving over sophisticated systems based on click-history feedback, but also compensating to some extent for the lack of click history. In contrast, we find that the language-modeling and vector-space approaches are significantly less effective.
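
As a rough illustration of what word-pair indicator features between a query and an ad might look like, here is a minimal Python sketch. The abstract does not specify tokenization, feature naming, or model details, so everything here is an assumption: the whitespace tokenizer, the `q:...|a:...` naming scheme, the helper names `word_pair_features` and `click_probability`, and the example weights are all hypothetical. The scoring function is a plain logistic model, used only as a stand-in for a maximum-entropy model over binary features, not as the authors' system.

```python
import math
from itertools import product


def word_pair_features(query, ad_text):
    """Return one binary indicator feature per (query word, ad word) pair.

    Assumption: lowercase whitespace tokenization; the paper's actual
    preprocessing is not described in the abstract.
    """
    query_words = set(query.lower().split())
    ad_words = set(ad_text.lower().split())
    return {f"q:{q}|a:{a}" for q, a in product(query_words, ad_words)}


def click_probability(features, weights, bias=0.0):
    """Logistic score over active indicator features.

    Features missing from `weights` contribute zero, which is how an
    unseen word pair would behave in this simplified setup.
    """
    score = bias + sum(weights.get(f, 0.0) for f in features)
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical learned weights, for illustration only.
example_weights = {
    "q:flights|a:flights": 1.2,
    "q:cheap|a:online": -0.3,
}

features = word_pair_features("cheap flights", "Book Flights Online")
p_click = click_probability(features, example_weights, bias=-2.0)
```

Because each feature depends only on the query and ad text, such features are available even for ads with no click history, which is the long-tail setting the abstract describes.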