404 not found: the stability and persistence of URLs published in MEDLINE

  • Authors: Jonathan D. Wren

  • Affiliations: Advanced Center for Genome Technology, Department of Botany and Microbiology, The University of Oklahoma, 620 Parrington Oval Rm. 106, Norman, OK 73019, USA

  • Venue: Bioinformatics
  • Year: 2004

Abstract

Motivation: The advent of the World Wide Web has enabled unprecedented supplementation of traditional journal publications, allowing access to resources such as video, sound, software, databases, datasets too large to publish, and even supplementary information and discussion. However, unlike traditional publications, continued availability of these online resources is not guaranteed. An automated survey was conducted to quantify the growth in Uniform Resource Locators (URLs) published to date in MEDLINE abstracts, their current availability and distribution by journal.

Results: Of 1630 unique URLs identified, formatting and/or spelling errors were detected within 201 (12%) of them as published. After corrections were made, a survey revealed that ∼63% of these URLs were consistently available, and another 19% were available intermittently. The rate of failure was far worse for anonymous login to FTP sites, with only 12 of 33 sites (36%) responding. This survey also shows that journals vary disproportionately in the number of web citations published, suggesting policy implementation among a few could have a profound impact overall. Out of the 306 journals with a URL published in an abstract, Bioinformatics published the most (12% of total).

Availability: URL database and program available by request.
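The abstract does not describe how the survey program works, so the following is only a minimal sketch of the general idea: pull URLs out of abstract text, then label each one from repeated availability checks. The regular expression, the function names (`extract_urls`, `classify`, `summarize`) and the rule that "consistent" means every check succeeded are all assumptions made for illustration, not the author's method. Network access is deliberately left out; check outcomes are passed in as booleans.

```python
import re
from collections import Counter

# Hypothetical pattern: the paper does not publish its extraction rules.
URL_PATTERN = re.compile(r"(?:https?|ftp)://[^\s<>\"')\]]+", re.IGNORECASE)

LABELS = ("consistent", "intermittent", "unavailable")


def extract_urls(abstract):
    """Return unique URLs in order of first appearance.

    Trailing sentence punctuation is stripped, since URLs in running
    text are often followed directly by a period or comma.
    """
    seen = []
    for match in URL_PATTERN.findall(abstract):
        url = match.rstrip(".,;:")
        if url not in seen:
            seen.append(url)
    return seen


def classify(results):
    """Label one URL from repeated check outcomes (True = responded).

    Assumed rule: all checks succeed -> consistent, some succeed ->
    intermittent, none succeed -> unavailable.
    """
    if not results:
        raise ValueError("at least one check is required")
    if all(results):
        return "consistent"
    if any(results):
        return "intermittent"
    return "unavailable"


def summarize(checks):
    """Map {url: [bool, ...]} to the fraction of URLs under each label."""
    counts = Counter(classify(outcomes) for outcomes in checks.values())
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: counts[label] / total for label in LABELS}
```

In a real survey the boolean lists would come from issuing requests to each URL on several occasions; keeping that step separate makes the classification logic easy to test in isolation.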