Grid Approach to Embarrassingly Parallel CPU-Intensive Bioinformatics Problems

  • Authors:
  • Heinz Stockinger; Marco Pagni; Lorenzo Cerutti; Laurent Falquet

  • Affiliations:
  • Swiss Institute of Bioinformatics, Vital-IT, Switzerland (all authors)

  • Venue:
  • E-SCIENCE '06 Proceedings of the Second IEEE International Conference on e-Science and Grid Computing
  • Year:
  • 2006

Abstract

Bioinformatics algorithms such as sequence alignment methods based on profile HMMs (hidden Markov models) are popular but CPU-intensive. When large amounts of data are processed, a single computer often runs for many hours or even days. High-performance infrastructures such as clusters or computational Grids provide the means to speed up the process by distributing the workload to remote nodes and running parts of it in parallel. However, biologists often do not have access to such hardware systems. We therefore propose a new system that uses a modern Grid approach to optimise an embarrassingly parallel problem. We achieve speedups of at least two orders of magnitude, provided that we can use a powerful, world-wide distributed Grid infrastructure. For large-scale problems, our method can outperform algorithms designed for mid-size clusters, even when the additional latencies imposed by Grid infrastructures are taken into account.
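
The scatter/gather pattern behind an embarrassingly parallel workload can be illustrated with a minimal sketch. This is not the authors' system: the function names, the toy motif-counting score (a stand-in for a real profile-HMM search), and the use of local threads in place of remote Grid nodes are all assumptions made for illustration. The point is only that each chunk is processed with no communication between workers, so the results can simply be concatenated afterwards.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal parts."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when there are few items
            chunks.append(items[start:end])
        start = end
    return chunks


def toy_score(sequence, motif="ACG"):
    """Hypothetical stand-in for a profile-HMM score: count motif occurrences."""
    width = len(motif)
    return sum(
        1
        for i in range(len(sequence) - width + 1)
        if sequence[i:i + width] == motif
    )


def process_chunk(chunk):
    """Work done by one node: score each sequence in the chunk independently."""
    return [(seq, toy_score(seq)) for seq in chunk]


def run_parallel(sequences, n_workers=4):
    """Scatter chunks to workers, then gather results in the original order."""
    chunks = split_into_chunks(sequences, n_workers)
    # Assumption: local threads stand in for remote nodes. For CPU-bound work
    # in CPython they illustrate the structure, not a real speedup.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_results = list(pool.map(process_chunk, chunks))  # keeps chunk order
    return [pair for part in partial_results for pair in part]
```

Calling `run_parallel(["ACGACG", "TTTT", "ACG"], n_workers=2)` returns one `(sequence, score)` pair per input sequence, in input order. In a real deployment each chunk would be submitted as a separate job, and scheduling and data-transfer latencies would add to the per-chunk cost, which is why the approach pays off mainly for large inputs.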