Protein simulation data in the relational model

Authors:
Andrew M. Simms; Valerie Daggett
Affiliations:
Andrew M. Simms: Biomedical and Health Informatics Program, University of Washington, Seattle, USA 98195-5013
Valerie Daggett: Biomedical and Health Informatics Program, University of Washington, Seattle, USA 98195-5013 and Bioengineering, University of Washington, Seattle, USA 98195-5013
Venue:
The Journal of Supercomputing
Year:
2012

Abstract

High-performance computing is producing unprecedented volumes of data. Relational databases offer a robust and scalable model for storing and analyzing scientific data. However, these features come at a cost: significant design effort is required to build a functional and efficient repository. Modeling protein simulation data in a relational database presents several challenges: the data captured from individual simulations are large and multidimensional, and they must integrate with both simulation software and external data sites. Here, we present the dimensional design and relational implementation of a comprehensive data warehouse for storing and analyzing molecular dynamics simulations using SQL Server.
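
To illustrate what a dimensional design for simulation data might look like, the following is a minimal sketch and not the authors' schema. It assumes a star layout with one fact table of atomic coordinates keyed by simulation, time step, and atom, plus two small dimension tables. All table and column names (`dim_simulation`, `dim_atom`, `fact_coordinate`, and so on) are hypothetical, and SQLite from the Python standard library stands in for SQL Server so the example runs without a server.

```python
import sqlite3


def build_demo_warehouse():
    """Create an in-memory star schema with a few illustrative rows.

    Every table and column name here is hypothetical, chosen only to
    show the fact/dimension split; it is not the published schema.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Dimension: one row per simulation run.
        CREATE TABLE dim_simulation (
            simulation_id INTEGER PRIMARY KEY,
            protein_name  TEXT NOT NULL,
            temperature_k REAL NOT NULL
        );
        -- Dimension: one row per atom in the simulated structure.
        CREATE TABLE dim_atom (
            atom_id      INTEGER PRIMARY KEY,
            residue_name TEXT NOT NULL,
            atom_name    TEXT NOT NULL
        );
        -- Fact: one row per (simulation, time step, atom).
        CREATE TABLE fact_coordinate (
            simulation_id INTEGER NOT NULL REFERENCES dim_simulation(simulation_id),
            step          INTEGER NOT NULL,
            atom_id       INTEGER NOT NULL REFERENCES dim_atom(atom_id),
            x REAL NOT NULL,
            y REAL NOT NULL,
            z REAL NOT NULL,
            PRIMARY KEY (simulation_id, step, atom_id)
        );
        """
    )
    conn.execute(
        "INSERT INTO dim_simulation VALUES (1, 'example_protein', 298.0)"
    )
    conn.executemany(
        "INSERT INTO dim_atom VALUES (?, ?, ?)",
        [(1, "ALA", "CA"), (2, "GLY", "CA")],
    )
    conn.executemany(
        "INSERT INTO fact_coordinate VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, 0, 1, 0.0, 0.0, 0.0),
            (1, 0, 2, 3.0, 4.0, 0.0),
            (1, 1, 1, 0.0, 0.0, 0.0),
            (1, 1, 2, 6.0, 8.0, 0.0),
        ],
    )
    return conn


def mean_x_per_step(conn, simulation_id):
    """Return (step, mean x coordinate) pairs for one simulation."""
    return conn.execute(
        """
        SELECT f.step, AVG(f.x)
        FROM fact_coordinate AS f
        JOIN dim_simulation AS s ON s.simulation_id = f.simulation_id
        WHERE s.simulation_id = ?
        GROUP BY f.step
        ORDER BY f.step
        """,
        (simulation_id,),
    ).fetchall()


if __name__ == "__main__":
    warehouse = build_demo_warehouse()
    print(mean_x_per_step(warehouse, 1))
```

Under this assumed layout, the large multidimensional measurements live in a single narrow fact table, while descriptive attributes sit in dimensions, so per-step aggregates reduce to a join and a `GROUP BY`.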