XArch: archiving scientific and reference data

  • Authors:
  • Heiko Müller;Peter Buneman;Ioannis Koltsidas

  • Affiliations:
  • University of Edinburgh, Edinburgh, United Kingdom;University of Edinburgh, Edinburgh, United Kingdom;University of Edinburgh, Edinburgh, United Kingdom

  • Venue:
  • Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data
  • Year:
  • 2008

Abstract

Database archiving is important for the retrieval of old versions of a database and for temporal queries over the history of data. We demonstrate XArch, a management system for maintaining, populating, and querying archives of hierarchical data. XArch is based on a nested merge approach that efficiently stores multiple versions of hierarchical data in a compact archive. By merging elements into one data structure, any specific version is retrievable from the archive in a single pass over the data and efficient tracking of object history is possible. XArch implements this approach and extends it in two important ways. First, in order to merge large hierarchical data sets, elements need to be sorted according to their key values. We developed an efficient algorithm for sorting hierarchical data in secondary storage and modified the nested merge algorithm accordingly. Second, we designed and implemented a declarative query language that enables one both to view data from particular versions and to track the history of objects. We demonstrate this using both molecular biology and demographic reference data as examples.
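
The nested merge idea described above can be illustrated with a minimal in-memory sketch. It is not XArch's implementation: the names `Node`, `merge`, `retrieve`, and `history` are hypothetical, each version is assumed to be a nested dictionary whose keys play the role of element key values, and every node stores an explicit set of version numbers. The secondary-storage sorting step and the declarative query language are not modelled here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One element of the archive (hypothetical structure for illustration)."""
    versions: set = field(default_factory=set)    # versions in which the element exists
    children: dict = field(default_factory=dict)  # child key -> Node
    values: dict = field(default_factory=dict)    # leaf value -> versions holding that value


def merge(node, doc, version):
    """Merge one version of a keyed hierarchy (nested dicts) into the archive."""
    node.versions.add(version)
    if isinstance(doc, dict):
        # Visit children in key order, echoing the sort-then-merge step.
        for key in sorted(doc):
            merge(node.children.setdefault(key, Node()), doc[key], version)
    else:
        # Leaf: record which versions carry this particular value.
        node.values.setdefault(doc, set()).add(version)


def retrieve(node, version):
    """Rebuild a single version in one traversal of the archive."""
    for value, versions in node.values.items():
        if version in versions:
            return value
    return {key: retrieve(child, version)
            for key, child in node.children.items()
            if version in child.versions}


def history(node, path):
    """Return the sorted versions in which the element at `path` exists."""
    for key in path:
        node = node.children[key]
    return sorted(node.versions)


# Two made-up versions of a small reference data set.
v1 = {"P1": {"name": "kinase", "length": "120"}}
v2 = {"P1": {"name": "kinase", "length": "125"}, "P2": {"name": "actin"}}

archive = Node()
merge(archive, v1, 1)
merge(archive, v2, 2)

restored_v1 = retrieve(archive, 1)
restored_v2 = retrieve(archive, 2)
p2_history = history(archive, ["P2"])
```

In this sketch, elements shared across versions (such as `P1` and its `name`) are stored once and annotated with the versions that contain them, which is what keeps a merged archive compact and makes tracking an object's history a simple lookup along its key path.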