The Design and Implementation of a Database For Human Genome Research (Position Paper)

  • Authors:
  • Rob Sargent; Dave Fuhrman; Terence Critchlow; Tony Di Sera; Robert Mecklenburg; Gary Lindstrom; Peter Cartwright

  • Venue:
  • SSDBM '96 Proceedings of the Eighth International Conference on Scientific and Statistical Database Management
  • Year:
  • 1996

Abstract

The Human Genome Project poses severe challenges in database design and implementation. These include comprehensive coverage of diverse data domains and user constituencies; robustness in the presence of incomplete, inconsistent, and multi-version data; accessibility through many levels of abstraction; and scalability in content and organizational complexity. This paper presents a new data model developed by the Utah Center for Human Genome Research to meet these challenges. Its central characteristics are (i) a high-level data model comprising five broadly applicable workflow notions; (ii) representation of those notions as objects in an extended relational model; (iii) expression of working database schemas as metadata in administration tables; (iv) population of the database through tables dependent on the metadata tables; and (v) implementation via a conventional relational database management system. We explore two advantages of this approach: the resulting representational flexibility, and the reflective use of metadata to accomplish schema evolution by ordinary updates. Implementation and performance pragmatics of this work are sketched, as well as implications for future database development.
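
The idea of keeping the working schema in administration tables, so that schema evolution becomes an ordinary update, can be illustrated with a minimal sketch. This is not the authors' schema: the table names (`meta_class`, `meta_attribute`, `object`, `object_value`), the helper functions, and the `Sample` class are all hypothetical, and SQLite merely stands in for a conventional relational database management system.

```python
import sqlite3

def open_db():
    """Create an in-memory database with hypothetical administration
    (metadata) tables and generic data tables that depend on them."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        -- Administration tables: the working schema stored as rows.
        CREATE TABLE meta_class (class_name TEXT PRIMARY KEY);
        CREATE TABLE meta_attribute (
            class_name TEXT NOT NULL REFERENCES meta_class(class_name),
            attr_name  TEXT NOT NULL,
            PRIMARY KEY (class_name, attr_name)
        );
        -- Data tables: populated only in terms of the metadata above.
        CREATE TABLE object (
            obj_id     INTEGER PRIMARY KEY,
            class_name TEXT NOT NULL REFERENCES meta_class(class_name)
        );
        CREATE TABLE object_value (
            obj_id    INTEGER NOT NULL REFERENCES object(obj_id),
            attr_name TEXT NOT NULL,
            value     TEXT,
            PRIMARY KEY (obj_id, attr_name)
        );
    """)
    return db

def define_attribute(db, class_name, attr_name):
    """Evolve the schema with plain INSERTs; no ALTER TABLE is needed."""
    db.execute("INSERT OR IGNORE INTO meta_class VALUES (?)", (class_name,))
    db.execute("INSERT INTO meta_attribute VALUES (?, ?)",
               (class_name, attr_name))

def add_object(db, class_name, values):
    """Store an object after checking its attributes against the metadata."""
    allowed = {row[0] for row in db.execute(
        "SELECT attr_name FROM meta_attribute WHERE class_name = ?",
        (class_name,))}
    unknown = set(values) - allowed
    if unknown:
        raise ValueError(
            f"undeclared attributes for {class_name}: {sorted(unknown)}")
    cur = db.execute("INSERT INTO object (class_name) VALUES (?)",
                     (class_name,))
    obj_id = cur.lastrowid
    db.executemany("INSERT INTO object_value VALUES (?, ?, ?)",
                   [(obj_id, k, v) for k, v in values.items()])
    return obj_id
```

In this sketch, calling `define_attribute(db, "Sample", "source")` on a live database makes `source` a legal attribute of `Sample` objects from then on, while the physical tables stay unchanged. The trade-off of such a generic layout is that queries must join through the value table, which is presumably one of the performance concerns the abstract alludes to.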