Managing versions of web documents in a transaction-time web server

  • Authors: Curtis E. Dyreson; Hui-ling Lin; Yingxia Wang

  • Affiliations: Washington State University, Pullman, WA (all authors)

  • Venue: Proceedings of the 13th international conference on World Wide Web
  • Year: 2004

Abstract

This paper presents a transaction-time HTTP server, called TTApache, that supports document versioning. A document often consists of a main file formatted in HTML or XML and several included files, such as images and stylesheets. A change to any of the files associated with a document creates a new version of that document. To construct a document version history, snapshots of the document's files are obtained over time. Transaction times are associated with each file version to record the version's lifetime. The transaction time is the system time of the edit that created the version. Accounting for transaction time is essential to supporting audit queries that delve into past document versions and differential queries that pinpoint differences between two versions. TTApache performs automatic versioning when a document is read, thereby removing the burden of versioning from document authors. Since some versions may be created but never read, TTApache distinguishes between known and assumed versions of a document. TTApache has a simple query language to retrieve desired versions. A browser can request a specific version or the entire history of a document. Queries can also rewrite links and references to point to current or past versions. Over time, the version history of a document continually grows. To free space, some versions can be vacuumed. Vacuuming a version, however, changes the semantics of requests for that version. This paper presents several policies for vacuuming versions and strategies for accounting for vacuumed versions in queries.
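
The idea of versioning on read with transaction-time lifetimes can be illustrated with a minimal sketch. This is not TTApache's implementation or query language: the names `Version`, `VersionHistory`, `observe_on_read`, and `as_of` are hypothetical, the history is kept in memory for a single file, and the caller is assumed to supply the system time of the edit (for example, the file's modification time). The sketch does not model the distinction between known and assumed versions, or vacuuming.

```python
from dataclasses import dataclass
from typing import List, Optional

FOREVER = float("inf")  # open-ended end time for the current version


@dataclass
class Version:
    content: bytes
    start: float          # transaction time at which this version began
    end: float = FOREVER  # transaction time at which it was superseded


class VersionHistory:
    """Hypothetical in-memory history for one file (illustration only)."""

    def __init__(self) -> None:
        self.versions: List[Version] = []

    def observe_on_read(self, content: bytes, edit_time: float) -> Version:
        """Take a snapshot when the file is read.

        A new version is stored only if the content differs from the
        latest stored version; otherwise the latest version is returned.
        """
        latest = self.versions[-1] if self.versions else None
        if latest is not None and latest.content == content:
            return latest
        if latest is not None:
            latest.end = edit_time  # close the previous version's lifetime
        new = Version(content, start=edit_time)
        self.versions.append(new)
        return new

    def as_of(self, t: float) -> Optional[Version]:
        """Return the version whose lifetime [start, end) contains time t."""
        for v in self.versions:
            if v.start <= t < v.end:
                return v
        return None
```

In this sketch, reading an unchanged file stores nothing new, while reading a changed file closes the previous lifetime and opens a new one. A request for the document as of some past time then reduces to a lookup over the recorded lifetimes.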