Preservation of digital data with self-validating, self-instantiating knowledge-based archives

  • Authors:
  • Bertram Ludäscher;Richard Marciano;Reagan Moore

  • Affiliations:
  • San Diego Supercomputer Center, U.C. San Diego;San Diego Supercomputer Center, U.C. San Diego;San Diego Supercomputer Center, U.C. San Diego

  • Venue:
  • ACM SIGMOD Record
  • Year:
  • 2001

Abstract

Digital archives are dedicated to the long-term preservation of electronic information and have the mandate to enable sustained access despite rapid technology changes. Persistent archives are confronted with heterogeneous data formats, helper applications, and platforms being used over the lifetime of the archive. This is not unlike the interoperability challenges for which mediators are devised. To prevent technological obsolescence over time and across platforms, a migration approach for persistent archives is proposed based on an XML infrastructure. We extend current archival approaches that build upon standardized data formats and simple metadata mechanisms for collection management by involving high-level conceptual models and knowledge representations as an integral part of the archive and the ingestion/migration processes. Infrastructure independence is maximized by archiving generic, executable specifications of (i) archival constraints (i.e., "model validators"), and (ii) archival transformations that are part of the ingestion process. The proposed architecture facilitates construction of self-validating and self-instantiating knowledge-based archives. We illustrate our overall approach and report on first experiences using a sample collection from a collaboration with the National Archives and Records Administration (NARA).
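
The idea of storing constraints and transformations alongside the data can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the specifications are written, so the sample collection, the element names, the dictionary-based constraint format, and the functions `validate` and `migrate` are all hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical archived collection, serialized as XML (not from the paper).
ARCHIVED_XML = """
<collection>
  <record id="r1"><title>Memo A</title><year>1975</year></record>
  <record id="r2"><title>Memo B</title><year>1981</year></record>
</collection>
"""

# Assumed format: archival constraints ("model validators") kept as plain
# declarative data, so they can be stored in the archive next to the content.
CONSTRAINTS = [
    {"name": "record-has-title", "path": ".//record", "required_child": "title"},
    {"name": "record-has-year", "path": ".//record", "required_child": "year"},
]

# Assumed format: an archival transformation, also stored as data.
TRANSFORMATION = {"rename": {"record": "item"}}


def validate(root, constraints):
    """Return (constraint name, element id) pairs for every violation found."""
    violations = []
    for constraint in constraints:
        for node in root.findall(constraint["path"]):
            if node.find(constraint["required_child"]) is None:
                violations.append((constraint["name"], node.get("id")))
    return violations


def migrate(root, transformation):
    """Rename element tags in place according to the stored transformation."""
    renames = transformation["rename"]
    for node in root.iter():
        node.tag = renames.get(node.tag, node.tag)
    return root


if __name__ == "__main__":
    collection = ET.fromstring(ARCHIVED_XML)
    # Self-validation: the archive checks its content against its own constraints.
    print("violations:", validate(collection, CONSTRAINTS))
    # Migration: the stored transformation is replayed on the content.
    migrated = migrate(collection, TRANSFORMATION)
    print(ET.tostring(migrated, encoding="unicode"))
```

Because both the constraints and the transformation are plain data rather than code tied to a particular tool, a future platform would only need to reimplement the two small interpreter functions to re-validate or re-instantiate the collection, which is one way to read the abstract's goal of infrastructure independence.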