Managing distributed collections: evaluating web page changes, movement, and replacement

  • Authors:
  • Zubin Dalal; Suvendu Dash; Pratik Dave; Luis Francisco-Revilla; Richard Furuta; Unmil Karadkar; Frank Shipman

  • Affiliations:
  • Texas A&M University, College Station, TX (all authors)

  • Venue:
  • Proceedings of the 4th ACM/IEEE-CS joint conference on Digital libraries
  • Year:
  • 2004

Abstract

Distributed collections of Web materials are common. Bookmark lists, paths, and catalogs such as Yahoo! Directories require human maintenance to keep up to date with changes to the underlying documents. The Walden's Paths Path Manager is a tool to support the maintenance of distributed collections. Earlier efforts focused on recognizing the type and degree of change within Web pages and identifying pages that are no longer accessible. We now extend this work with algorithms for evaluating drastic changes to page content based on context. Additionally, we expand on previous work to locate moved pages and apply the modified approach to suggesting page replacements when the original page cannot be found. Based on these results, we are redesigning the Path Manager to better support the range of assessments necessary to manage distributed collections.