Graph data management for molecular and cell biology

Authors:
B. A. Eckman;P. G. Brown
Venue:
IBM Journal of Research and Development - Systems biology
Year:
2006

Citing 4

A federated architecture for information management
ACM Transactions on Information Systems (TOIS)

Moments and points in an interval-based temporal logic
Computational Intelligence

Object-Relational DBMSs: Tracking the Next Great Wave
Object-Relational DBMSs: Tracking the Next Great Wave

DiscoveryLink: a system for integrated access to life sciences data sources
IBM Systems Journal - Deep computing for the life sciences

Cited 3

How to authenticate graphs without leaking
Proceedings of the 13th International Conference on Extending Database Technology

Relational operators for prioritizing candidate biomarkers in high-throughput differential expression data
Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology

Using semantic web tools to integrate experimental measurement data on our own terms
OTM'06 Proceedings of the 2006 international conference on On the Move to Meaningful Internet Systems: AWeSOMe, CAMS, COMINF, IS, KSinBIT, MIOS-CIAO, MONET - Volume Part I

Abstract

As high-throughput biology begins to generate large volumes of systems biology data, the need grows for robust, efficient database systems to support investigations of metabolic and signaling pathways, chemical reaction networks, gene regulatory networks, and protein interaction networks. Network data is frequently represented as graphs, and researchers need to navigate, query, and manipulate this data in ways that are not well supported by standard relational database management systems (RDBMSs). Current approaches to managing graphs in an RDBMS rely on either external procedural logic to execute the graph algorithms or clumsy and inefficient algorithms implemented in Structured Query Language (SQL). In this paper we describe the Systems Biology Graph Extender, a research prototype that extends the IBM RDBMS (DB2® Universal Database software) with graph objects and operations to support declarative SQL queries over biological networks and other graph structures. Supported operations include neighborhood queries, shortest path queries, spanning trees, graph transposition, and graph matching. In a federated database environment, graph operations may be applied to data stored in any format, whether remote or local, relational or nonrelational. A single federated query may include both graph-based predicates and predicates over related data sources, such as microarray expression levels, clinical prognosis and outcome, or the function of orthologous proteins (i.e., proteins that are evolutionarily related to those in another species) in mouse disease models.
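
To make three of the operations named in the abstract concrete (neighborhood queries, shortest path queries, and graph transposition), the following is a minimal Python sketch over a toy directed network. It is not the Systems Biology Graph Extender's interface, which the abstract describes only as declarative SQL over DB2. The node labels, the edge list, and the function names `build_adjacency`, `transpose`, `neighborhood`, and `shortest_path` are all hypothetical, chosen for illustration. Paths are measured in hops, which is an assumption; the abstract does not say whether edges carry weights.

```python
from collections import deque

# Hypothetical directed interaction network: an edge (u, v) means "u acts on v".
EDGES = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C"), ("C", "E")]


def build_adjacency(edges):
    """Map each node to the set of nodes it points to."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())  # make sure sink nodes appear as keys too
    return adj


def transpose(edges):
    """Graph transposition: reverse the direction of every edge."""
    return [(v, u) for u, v in edges]


def neighborhood(adj, start, max_hops):
    """Nodes reachable from start in at most max_hops edges, excluding start."""
    seen = {start}
    frontier = {start}
    for _ in range(max_hops):
        # Expand one hop, keeping only nodes not visited before.
        frontier = {n for node in frontier for n in adj[node]} - seen
        seen |= frontier
    return seen - {start}


def shortest_path(adj, source, target):
    """Breadth-first search; returns a fewest-hops path as a list, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk parent links back to the source
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in sorted(adj[node]):  # sorted for a deterministic result
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None
```

With this toy data, `neighborhood(build_adjacency(EDGES), "A", 1)` yields the direct targets of `A`, and running `shortest_path` on `build_adjacency(transpose(EDGES))` answers the reverse question of which nodes lead into a given node. In the setting the abstract describes, the edge list would come from relational or federated sources rather than an in-memory list.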