RepWeb: replicated Web with referential integrity

  • Authors:
  • Luís Veiga; Paulo Ferreira

  • Affiliations:
  • INESC-ID Lisboa/IST, Rua Alves Redol, 9, Lisboa, Portugal; INESC-ID Lisboa/IST, Rua Alves Redol, 9, Lisboa, Portugal

  • Venue:
  • Proceedings of the 2003 ACM Symposium on Applied Computing
  • Year:
  • 2003

Abstract

Replication of web content, through mirroring of web sites or browsing off-line content, is one of the most used techniques to increase content availability, reduce network bandwidth usage and minimize browsing delays in the world-wide-web.

The world-wide-web does not support referential integrity, i.e., broken links do exist. This has been considered, for some years now, one of the most serious problems of the web. This is true in various fields, e.g., i) if a user pays for some service in the form of web pages, he requires such pages to be reachable all the time, and ii) archived web resources, either scientific, legal or historic, that are still referenced, need to be preserved and remain available.

Current approaches to the broken-link problem are not able to preserve referential integrity on the web and, simultaneously, support replication and minimize storage waste due to memory leaks. Some of them also impose specific authoring and management systems. Thus, the limitations of current systems reside in three issues: transparency, completeness and safety.

We propose a system, RepWeb, comprised of an application to access and manage replicated web content and an implementation of an acyclic distributed garbage collection algorithm for wide-area replicated memory, that satisfies all these requirements. It supports replication, enforces referential integrity on the web and minimizes storage waste.
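
The idea of enforcing referential integrity through acyclic garbage collection can be illustrated with a minimal, single-process sketch. It is not the RepWeb algorithm, which the abstract does not detail; it is a generic reference-listing scheme, and the names `Page`, `Store`, `pinned`, `add`, `unpin` and `collect` are hypothetical. A page is reclaimed only when no other stored page links to it and it is not explicitly kept, so no link held by a stored page can break. Being acyclic, the scheme never reclaims pages that only reference each other.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Page:
    url: str
    links: set[str] = field(default_factory=set)      # outgoing links
    referrers: set[str] = field(default_factory=set)  # stored pages linking here
    pinned: bool = False                               # kept regardless of referrers


class Store:
    """Hypothetical page store that never deletes a page that is still linked."""

    def __init__(self) -> None:
        self.pages: dict[str, Page] = {}

    def add(self, url: str, links=(), pinned: bool = False) -> None:
        # Pages already stored may link to the new page: record them as referrers.
        referrers = {u for u, p in self.pages.items() if url in p.links}
        self.pages[url] = Page(url, set(links), referrers, pinned)
        # Register the new page as a referrer of every target already stored.
        for target in links:
            if target in self.pages:
                self.pages[target].referrers.add(url)

    def unpin(self, url: str) -> None:
        self.pages[url].pinned = False

    def collect(self) -> list[str]:
        """Delete unpinned pages without referrers, cascading to their targets."""
        removed: list[str] = []
        work = [u for u, p in self.pages.items()
                if not p.pinned and not p.referrers]
        while work:
            page = self.pages.pop(work.pop(), None)
            if page is None:  # already removed earlier in this pass
                continue
            removed.append(page.url)
            for target in page.links:
                t = self.pages.get(target)
                if t is None:
                    continue
                t.referrers.discard(page.url)
                if not t.pinned and not t.referrers:
                    work.append(target)
        return removed
```

In this sketch, a pinned page `a` linking to `b`, which links to `c`, keeps all three alive; once `a` is unpinned, `collect()` removes the whole chain. Two pages that link only to each other are never removed, which is the storage leak inherent to any acyclic collector.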