Managing and querying transaction-time databases under schema evolution

  • Authors:
  • Hyun J. Moon;Carlo A. Curino;Alin Deutsch;Chien-Yi Hou;Carlo Zaniolo

  • Affiliations:
  • UCLA;Politecnico di Milano;UC San Diego;UC San Diego;UCLA

  • Venue:
  • Proceedings of the VLDB Endowment
  • Year:
  • 2008

Quantified Score

Hi-index 0.01

Visualization

Abstract

The old problem of managing the history of database information is now made more urgent and complex by fast spreading web information systems, such as Wikipedia. Our PRIMA system addresses this difficult problem by introducing two key pieces of new technology. The first is a method for publishing the history of a relational database in XML, whereby the evolution of the schema and its underlying database are given a unified representation. This temporally grouped representation makes it easy to formulate sophisticated historical queries on any given schema version using standard XQuery. The second key piece of technology is that schema evolution is transparent to the user: she writes queries against the current schema while retrieving the data from one or more schema versions. The system then performs the labor-intensive and error-prone task of rewriting such queries into equivalent ones for the appropriate versions of the schema. This feature is particularly important for historical queries spanning over potentially hundreds of different schema versions and it is realized in PRIMA by (i) introducing Schema Modification Operators (SMOs) to represent the mappings between successive schema versions and (ii) an XML integrity constraint language (XIC) to efficiently rewrite the queries using the constraints established by the SMOs. The scalability of the approach has been tested against both synthetic data and real-world data from the Wikipedia DB schema evolution history.