A survey of web archive search architectures

  • Authors:
  • Miguel Costa; Daniel Gomes; Francisco Couto; Mário Silva

  • Affiliations:
  • Foundation for National Scientific Computing & University of Lisbon, Lisbon, Portugal; Foundation for National Scientific Computing, Lisbon, Portugal; LaSIGE, Lisbon, Portugal; IST/INESC-ID, Lisbon, Portugal

  • Venue:
  • Proceedings of the 22nd international conference on World Wide Web companion
  • Year:
  • 2013

Abstract

Web archives already hold more than 282 billion documents, and users demand full-text search to explore this historical information. This survey provides an overview of web archive search architectures designed for time-travel search, i.e., full-text search on the web within a user-specified time interval. Performance, scalability and ease of management are important aspects to take into consideration when choosing a system architecture. We compare these aspects and open the discussion of which search architecture is most suitable for a large-scale web archive.