Archival HTTP redirection retrieval policies

  • Authors:
  • Ahmed AlSum; Michael L. Nelson; Robert Sanderson; Herbert Van de Sompel

  • Affiliations:
  • Old Dominion University, Norfolk, VA, USA; Old Dominion University, Norfolk, VA, USA; Los Alamos National Laboratory, Los Alamos, NM, USA; Los Alamos National Laboratory, Los Alamos, NM, USA

  • Venue:
  • Proceedings of the 22nd international conference on World Wide Web companion
  • Year:
  • 2013

Abstract

When retrieving archived copies of web resources (mementos) from web archives, the original resource's URI (URI-R) is typically used as the lookup key in the web archive. This is straightforward until the resource on the live web issues a redirect: R → R′. It is then unclear whether R or R′ should be used as the lookup key to the web archive. In this paper, we report on a quantitative study that evaluates a set of policies to help the client discover the correct memento when faced with redirection. We studied the stability of 10,000 resources and found that 48% of the sample URIs tested were not stable with respect to their status and redirection location. We measured reliability as the number of mementos with successful responses divided by the total number of mementos: 27% of the resources were not perfectly reliable, and 2% had a reliability score of less than 0.5. We tested two retrieval policies. The first policy covered resources that currently issue redirects; it successfully resolved 17 out of 77 URIs that had no mementos of the original URI but did have mementos of the resource being redirected to. The second policy covered archived copies with HTTP redirection and, in 58% of the cases tested, helped the client discover the memento nearest to the requested datetime.
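
The reliability score and the first retrieval policy described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names (`reliability`, `lookup_key`), the dictionary-based stand-ins for the archive index and for live-web redirects, and the choice to count any 2xx status as a "successful response" are all assumptions made for illustration only.

```python
from typing import Dict, List, Optional


def reliability(statuses: List[int]) -> Optional[float]:
    """Successful mementos divided by all mementos of one URI.

    Assumption: "successful" means an archived HTTP status in the 2xx range.
    Returns None when the URI has no mementos at all.
    """
    if not statuses:
        return None
    successful = sum(1 for status in statuses if 200 <= status < 300)
    return successful / len(statuses)


def lookup_key(
    uri_r: str,
    live_redirects: Dict[str, str],
    archive: Dict[str, List[int]],
) -> Optional[str]:
    """Pick the URI to use as the archive lookup key.

    Hypothetical fallback in the spirit of the first policy: use the
    original URI if it has mementos; otherwise, if the live resource
    redirects, try the redirect target. Only a single redirect hop is
    followed in this sketch.
    """
    if archive.get(uri_r):
        return uri_r
    target = live_redirects.get(uri_r)
    if target is not None and archive.get(target):
        return target
    return None


# Toy data: archived status codes per URI, and live-web redirects.
archive = {
    "http://example.com/new": [200, 200, 302, 404],
}
live_redirects = {"http://example.com/old": "http://example.com/new"}

key = lookup_key("http://example.com/old", live_redirects, archive)
score = reliability(archive[key]) if key is not None else None
```

In this toy data the original URI has no mementos, so the lookup falls back to the redirect target, whose four mementos include two successful responses. A real client would query a web archive rather than an in-memory dictionary.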