Grid Deployment of Legacy Bioinformatics Applications with Transparent Data Access

Authors:
Christophe Blanchet;Remi Mollon;Douglas Thain;Gilbert Deleage
Affiliations:
Institut de Biologie et Chimie des Protéines, IBCP UMR 5086/ CNRS/ Univ. Lyon1/ IFR128 BioSciences Lyon-Gerland/ 7, passage du Vercors, 69367 Lyon cedex 07, France. Christophe.Blanch;Institut de Biologie et Chimie des Protéines, IBCP UMR 5086/ CNRS/ Univ. Lyon1/ IFR128 BioSciences Lyon-Gerland/ 7, passage du Vercors, 69367 Lyon cedex 07, France. dthain@cse.nd.edu;Department of Computer Science and Engineering, University of Notre Dame, 384 Fitzpatrick Hall, Notre Dame, Indiana, United States. Remi.Mollon@ibcp.fr;Institut de Biologie et Chimie des Protéines, IBCP UMR 5086/ CNRS/ Univ. Lyon1/ IFR128 BioSciences Lyon-Gerland/ 7, passage du Vercors, 69367 Lyon cedex 07, France. Gilbert.Deleage@i
Venue:
GRID '06 Proceedings of the 7th IEEE/ACM International Conference on Grid Computing
Year:
2006

Abstract

Although grid computing offers great potential for executing large-scale bioinformatics applications, practical deployment is constrained by legacy interfaces. Most widely deployed bioinformatics applications were designed long before grid computing arose, and were thus created, tested, and validated in the familiar environment of a workstation. Most perform simple local I/O and have no facility for interfacing with a distributed system. Because of these limitations, users of bioinformatics applications are generally constrained to building large local clusters in order to perform data analysis. To deploy these applications in wide-area grid systems, users require a transparent mechanism for attaching legacy interfaces to grid I/O systems. We have explored this problem by deploying several bioinformatics databases and programs for protein sequence analysis on the European EGEE grid. Using tools for transparent adaptation, we have connected legacy applications to the logical namespace provided by a replica manager, and compared the performance of remote access with that of file staging. For common bioinformatics applications, we find that remote access performs as well as or better than simple file staging, with the added advantage that users are freed from stating the data needs of applications in advance.