Learning website hierarchies for keyword enrichment in contextual advertising

  • Authors:
  • Pavan Kumar GM;Krishna P. Leela;Mehul Parsana;Sachin Garg

  • Affiliations:
  • Yahoo! Labs, Bangalore, India;Microsoft adCenter, Bangalore, India;Microsoft adCenter, Bangalore, India;Yahoo! Labs, Bangalore, India

  • Venue:
  • Proceedings of the fourth ACM international conference on Web search and data mining
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

In Contextual advertising, textual ads relevant to the content in a webpage are embedded in the page. Content keywords are extracted offline by crawling webpages and then stored in an index for fast serving. Given a page, ad selection involves index lookup, computing similarity between the keywords of the page and those of candidate ads and returning the top-k scoring ads. In this approach, ad relevance can suffer in two scenarios. First, since page-ad similarity is computed using keywords extracted only from that particular page, a few non pertinent keywords can skew ad selection. Second, requesting page may not be present in the index but we still need to serve relevant ads. We propose a novel mechanism to mitigate these problems in the same framework. The basic idea is to enrich keywords of a particular page with keywords from other but "similar" pages. The scheme involves learning a website specific hierarchy from (page, URL) pairs of the website. Next, keywords are populated on the nodes via successive top-down and bottom-up iterations over the hierarchy. We evaluate our approach on three data sets, one small human labeled set and two large-scale sets from Yahoo's contextual advertising system. Empirical evaluation show that ads fetched by enriching keywords has 2-3% higher nDCG compared to ads fetched based on a recent semantic approach even though the index size of our approach is 7 times less than the index size of semantic approach. Evaluation over pages which are not present in the index shows that ads fetched by our method has 6-7% higher nDCG compared to ads fetched based on a recent approach which uses first N bytes of the page content. Scalability is demonstrated via map-reduce adoption of our method and training on a large data set of 220 million pages from 95,104 websites.