The SPIRIT collection: an overview of a large web collection

  • Authors: Hideo Joho; Mark Sanderson
  • Affiliations: University of Sheffield; University of Sheffield
  • Venue: ACM SIGIR Forum
  • Year: 2004

Abstract

Large-scale collections of web pages have been essential for research in information retrieval and related areas. This paper provides an overview of a large web collection used in the SPIRIT project for the design and testing of spatially-aware retrieval systems. Several statistics are derived and presented to show the characteristics of the collection.