Web Page Summarization for Just-in-Time Contextual Advertising

  • Authors:
  • Aris Anagnostopoulos; Andrei Z. Broder; Evgeniy Gabrilovich; Vanja Josifovski; Lance Riedel

  • Affiliations:
  • Sapienza University of Rome; Yahoo! Research; Yahoo! Research; Yahoo! Research; Yahoo! Research

  • Venue:
  • ACM Transactions on Intelligent Systems and Technology (TIST)
  • Year:
  • 2011

Abstract

Contextual advertising is a type of Web advertising that, given the URL of a Web page, aims to embed into the page the most relevant textual ads available. For static pages that are displayed repeatedly, ads can be matched on the basis of prior analysis of their entire content; often, however, ads must be matched to new or dynamically created pages that cannot be processed ahead of time. Analyzing the entire content of such pages on the fly entails prohibitive communication and latency costs. To resolve the three-horned dilemma of low relevance, high latency, or high load, we propose using text summarization techniques paired with external knowledge (exogenous to the page) to craft short page summaries in real time. Empirical evaluation shows that matching ads on the basis of such summaries sacrifices little relevance and is competitive with matching based on the entire page content. Specifically, we found that analyzing a carefully selected 6% fraction of the page text sacrifices only 1% to 3% in ad relevance. Furthermore, our summaries are fully compatible with the standard JavaScript mechanisms used for ad placement: they can be produced at ad-display time by simple additions to the usual script, and they add only 500 to 600 bytes to the usual request. We also compared our summarization approach, which is based on structural properties of the HTML content of the page, with a more principled approach based on a standard text summarization tool (MEAD), and found their performance to be comparable.
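
To make the idea of a structure-based summary concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which HTML elements are used, so the choice of URL path words, the title, meta keywords and description, and h1 to h3 headings is hypothetical, as are the names `StructuralExtractor` and `summarize`. The default 600-byte cap simply mirrors the 500 to 600 byte overhead quoted above. The paper describes producing summaries in the browser via JavaScript; Python is used here only for readability.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse
import re

# Hypothetical choice of "structural" fields; the abstract does not list them.
HEADING_TAGS = {"h1", "h2", "h3"}
META_NAMES = {"keywords", "description"}


class StructuralExtractor(HTMLParser):
    """Collects text from a few structurally prominent parts of a page."""

    def __init__(self):
        super().__init__()
        self.fields = {"title": [], "meta": [], "headings": []}
        self._current = None  # field currently receiving text, if any

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._current = "title"
        elif tag in HEADING_TAGS:
            self._current = "headings"
        elif tag == "meta":
            attr = dict(attrs)
            name = (attr.get("name") or "").lower()
            if name in META_NAMES and attr.get("content"):
                self.fields["meta"].append(attr["content"])

    def handle_endtag(self, tag):
        if tag == "title" or tag in HEADING_TAGS:
            self._current = None

    def handle_data(self, data):
        # Body text outside the selected elements is ignored entirely.
        if self._current and data.strip():
            self.fields[self._current].append(data.strip())


def summarize(url, html, budget_bytes=600):
    """Build a short page summary that fits in budget_bytes of UTF-8."""
    parser = StructuralExtractor()
    parser.feed(html)
    # Words in the URL path often hint at the page topic (an assumption here).
    url_words = " ".join(re.findall(r"[A-Za-z]+", urlparse(url).path))
    parts = [url_words]
    parts += [" ".join(parser.fields[k]) for k in ("title", "meta", "headings")]
    summary = " | ".join(p for p in parts if p)
    # Truncate by bytes; drop any multi-byte character cut in half.
    return summary.encode("utf-8")[:budget_bytes].decode("utf-8", errors="ignore")
```

In this sketch the summary, rather than the full page text, would be what accompanies the ad request, so the ad-matching side never needs to fetch or parse the page itself. Pairing the summary with external knowledge, as the abstract mentions, is outside the scope of the example.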