Web content cartography

  • Authors:
  • Bernhard Ager; Wolfgang Mühlbauer; Georgios Smaragdakis; Steve Uhlig

  • Affiliations:
  • TU Berlin & T-Labs, Berlin, Germany; ETH Zürich, Zürich, Switzerland; TU Berlin & T-Labs, Berlin, Germany; TU Berlin & T-Labs, Berlin, Germany

  • Venue:
  • Proceedings of the 2011 ACM SIGCOMM conference on Internet measurement conference
  • Year:
  • 2011

Abstract

Recent studies show that a significant part of Internet traffic is delivered through Web-based applications. To cope with the increasing demand for Web content, large-scale content hosting and delivery infrastructures, such as data centers and content distribution networks, are continuously being deployed. Being able to identify and classify such hosting infrastructures is helpful not only to content producers, content providers, and ISPs, but also to the research community at large, for example to quantify the degree of hosting infrastructure deployment in the Internet or the replication of Web content. In this paper, we introduce Web Content Cartography, i.e., the identification and classification of content hosting and delivery infrastructures. We propose a lightweight and fully automated approach to discover hosting infrastructures based only on DNS measurements and BGP routing table snapshots. Our experimental results show that our approach is feasible even with a limited number of well-distributed vantage points. We find that some popular content is served exclusively from specific regions and ASes. Furthermore, our classification enables us to derive content-centric AS rankings that complement existing AS rankings and shed light on recent observations about shifts in inter-domain traffic and the AS topology.
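
The abstract does not spell out the method itself. As a rough illustration of how DNS answers and a BGP snapshot could be combined, the Python sketch below maps each resolved IP address to its longest matching BGP prefix and origin AS, then groups hostnames whose answers fall into the same set of prefixes. The table contents, hostnames, and function names are hypothetical, and the grouping rule is an assumption made for illustration, not the clustering procedure used in the paper.

```python
import ipaddress
from collections import defaultdict

# Hypothetical BGP snapshot: announced prefix -> origin AS number.
BGP_TABLE = {
    "192.0.2.0/24": 64500,
    "198.51.100.0/24": 64501,
    "203.0.113.0/24": 64502,
}

# Hypothetical DNS answers: hostname -> IPs collected over all vantage points.
DNS_ANSWERS = {
    "www.example.com": ["192.0.2.10", "198.51.100.7"],
    "img.example.net": ["198.51.100.9", "192.0.2.200"],
    "blog.example.org": ["203.0.113.5"],
}


def longest_prefix_match(ip, table):
    """Return (network, origin_as) for the most specific covering prefix, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for prefix, asn in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, asn)
    return best


def footprint(ips, table):
    """Set of (prefix, origin AS) pairs from which a hostname is served."""
    pairs = set()
    for ip in ips:
        match = longest_prefix_match(ip, table)
        if match is not None:  # skip addresses not covered by the snapshot
            pairs.add((str(match[0]), match[1]))
    return frozenset(pairs)


def group_by_footprint(dns_answers, table):
    """Group hostnames that share an identical prefix/AS footprint (assumed rule)."""
    groups = defaultdict(list)
    for hostname, ips in dns_answers.items():
        groups[footprint(ips, table)].append(hostname)
    return {fp: sorted(hosts) for fp, hosts in groups.items()}


groups = group_by_footprint(DNS_ANSWERS, BGP_TABLE)
for fp, hosts in groups.items():
    print(sorted(fp), hosts)
```

In this toy input, the first two hostnames resolve into the same two prefixes and therefore land in one group, while the third forms its own. A real measurement would need a far larger hostname list, answers from many vantage points, and a prefix lookup structure faster than a linear scan.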