Genomic data modeling

  • Authors:
  • Jake Yue Chen; John V. Carlis

  • Affiliations:
  • Myriad Proteomics, Inc., 2150 West Dauntless Avenue, Salt Lake City, UT; Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN

  • Venue:
  • Information Systems - Special issue: Data management in bioinformatics
  • Year:
  • 2003

Abstract

Researchers face many challenges in representing biological data, including (1) the inherent complexity of biological data, (2) the domain knowledge barrier, (3) constantly evolving knowledge, and (4) a lack of expert data-modeling skills. We have studied how to represent biological sequences and sequence-related genomics concepts using logical data structures. Drawing on our experience with multiple genomic data modeling efforts, we present results in three areas: genomic schema elements, genomic schema fragments, and genomic data modeling lessons. A genomic schema element is a data model that contains only one basic biological sequence notion. Genomic schema elements give biology data modelers a baseline for thinking about genomic data modeling. A genomic schema fragment is a data model that covers only one genomic topic area. Genomic schema fragments give biology data modelers successful design solutions that they can adapt to the needs of their own problems. Genomic data modeling lessons address issues particularly important to genomic data modeling, such as modeling contextual information, intermediate and derived data, inconsistent data, and categorical rules. These lessons give novice biology data modelers principles that extend content-neutral data modeling techniques. In all, we demonstrate how to manage evolving genomic knowledge and discovery results using data modeling techniques extended into the genomics domain.
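
The distinction between a schema element and a schema fragment can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the paper works with logical data structure diagrams, not Python classes, and the names Sequence, SequenceRegion, GeneAnnotation, source, and derived are hypothetical and do not come from the paper. Two small classes each stand in for a single sequence notion (elements), and a third combines them into one topic area (a fragment) while recording contextual and derived-data information.

```python
from dataclasses import dataclass, field
from typing import List

# All names below are hypothetical illustrations, not the paper's schema.


@dataclass(frozen=True)
class Sequence:
    """Schema element: one basic notion, an identified string of residues."""
    seq_id: str
    residues: str


@dataclass(frozen=True)
class SequenceRegion:
    """Schema element: a contiguous region of a sequence (1-based, inclusive)."""
    sequence: Sequence
    start: int
    end: int

    def residues(self) -> str:
        # Convert 1-based inclusive coordinates to a Python slice.
        return self.sequence.residues[self.start - 1:self.end]


@dataclass
class GeneAnnotation:
    """Schema fragment: one topic area (gene annotation) built from elements."""
    gene_name: str
    regions: List[SequenceRegion] = field(default_factory=list)
    source: str = "unknown"   # contextual information about the annotation
    derived: bool = False     # True if computed from other data, not observed

    def spliced(self) -> str:
        # Join the regions in coordinate order.
        ordered = sorted(self.regions, key=lambda r: r.start)
        return "".join(r.residues() for r in ordered)


seq = Sequence("s1", "ATGCCCTAA")
annotation = GeneAnnotation(
    "demo_gene",
    [SequenceRegion(seq, 7, 9), SequenceRegion(seq, 1, 3)],
    source="manual curation",
)
print(annotation.spliced())
```

In this sketch a modeler could reuse the two element classes unchanged while adapting the fragment to a different topic area, which is roughly the kind of reuse the abstract attributes to schema fragments.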