A linear-time scheme for version reconstruction

  • Authors:
  • Lin Yu;Daniel J. Rosenkrantz

  • Affiliations:
  • State Univ. of New York, Albany;State Univ. of New York, Albany

  • Venue:
  • ACM Transactions on Programming Languages and Systems (TOPLAS)
  • Year:
  • 1994

Quantified Score

Hi-index 0.00

Visualization

Abstract

An efficient scheme to store and reconstruct versions of sequential files is presented. The reconstruction scheme involves building a data structure representing a complete version, and then successively modifying this data structure by applying a sequence of specially formatted differential files to it. Each application of a differential file produces a representation of an intermediate version, with the final data structure representing the requested version.The scheme uses a linked list to represent an intermediate version, instead of a sequential array, as is used traditionally. A new format for differential files specifying changes to this linked list data structure is presented. The specification of each change points directly to where the change is to take place, thereby obviating a search. Algorithms are presented for using such a new format differential file to transform the representation of a version, and for reconstructing a requested version. Algorithms are also presented for generating the new format differential files, both for the case of a forward differential specifying how to transform the representation of an old version to the representation of a new version, and for the case of a reverse differential specifying how to transform the representation of a new version to the representation of an old version.The new version reconstruction scheme takes time linear in the sum of the size of the initial complete version and the sizes of the file differences involved in reconstructing the requested version. In contrast, the classical scheme for reconstructing versions takes time proportional to the sum of the sizes of the sequence of versions involved in the reconstruction, and therefore has a worst-case time that is quadratic in the sum of the size of the initial complete version and the sizes of the file differences. The time cost of the new differential file generation scheme is comparable to the time cost of the classical differential file generation scheme.