Web Site Metadata

  • Authors:
  • Erik Wilde; Anuradha Roy

  • Affiliations:
  • School of Information, UC Berkeley; School of Information, UC Berkeley

  • Venue:
  • ICWE '09 Proceedings of the 9th International Conference on Web Engineering
  • Year:
  • 2009

Abstract

Understanding the availability of site metadata on the Web is a foundation for any system or application that works with the pages published by Web sites and needs to understand a Web site's structure. Little is known about how much information Web sites publish about themselves, and this paper presents data addressing this question. With this analysis of available Web site metadata, Web-oriented applications that rely on such metadata can be based on statistical evidence rather than assumptions. Our study of robots.txt files and sitemaps can serve as a starting point for Web-oriented applications that work with Web site metadata.
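
As an illustration of the kind of site metadata the abstract refers to, the following minimal sketch reads a robots.txt file and extracts its crawl rules and sitemap references using Python's standard urllib.robotparser module. It is not the authors' method or tooling. The sample file content, the example.org host, and the summarize_robots helper are hypothetical names introduced here for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; example.org is a placeholder host,
# not data taken from the paper.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: https://example.org/sitemap.xml
"""


def summarize_robots(text, base_url="https://example.org"):
    """Parse robots.txt text and report the metadata it exposes.

    summarize_robots is an illustrative helper, not part of any library.
    """
    parser = RobotFileParser()
    # parse() accepts the file as a list of lines, so no network access is needed.
    parser.parse(text.splitlines())
    return {
        # site_maps() returns None when no Sitemap lines are present (Python 3.8+).
        "sitemaps": parser.site_maps() or [],
        "private_allowed": parser.can_fetch("*", base_url + "/private/page.html"),
        "public_allowed": parser.can_fetch("*", base_url + "/index.html"),
    }


summary = summarize_robots(SAMPLE_ROBOTS_TXT)
print(summary)
```

Running the same helper over the robots.txt files of many sites and counting how often each field is populated would be one simple way to gather availability statistics of the sort the abstract describes, under the assumptions stated above.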