Characterization of national Web domains

  • Authors: Ricardo Baeza-Yates; Carlos Castillo; Efthimis N. Efthimiadis
  • Affiliations: Yahoo! Research; Cátedra Telefónica, Universitat Pompeu Fabra; University of Washington
  • Venue: ACM Transactions on Internet Technology (TOIT)
  • Year: 2007

Abstract

During the last few years, several studies on the characterization of the public Web space of various national domains have been published. The pages of a country are an interesting set for studying the characteristics of the Web because at the same time these are diverse (as they are written by several authors) and yet rather similar (as they share a common geographical, historical and cultural context). This article discusses the methodologies used for presenting the results of Web characterization studies, including the granularity at which different aspects are presented, and a separation of concerns between contents, links, and technologies. Based on this, we present a side-by-side comparison of the results of 12 Web characterization studies, comprising over 120 million pages from 24 countries. The comparison unveils similarities and differences between the collections and sheds light on how certain results of a single Web characterization study on a sample may be valid in the context of the full Web.