An adaptive multi-policy grid service for biological sequence comparison

  • Authors:
  • Marcelo S. Sousa;Alba C. M. A. Melo;Azzedine Boukerche

  • Affiliations:
  • University of Brasilia (UnB), Department of Computer Science, Campus UNB - ICC-Norte - sub-solo, 70910-900 Brasilia-DF, Brazil;University of Brasilia (UnB), Department of Computer Science, Campus UNB - ICC-Norte - sub-solo, 70910-900 Brasilia-DF, Brazil and University of Ottawa (SITE), PARADISE Research Laboratory, 800 Ki ...;University of Ottawa (SITE), PARADISE Research Laboratory, 800 King Edwards, K1N 6N5 Ottawa, Ontario, Canada

  • Venue:
  • Journal of Parallel and Distributed Computing
  • Year:
  • 2010

Abstract

The last decade has seen unprecedented development in molecular biology. A very large number of organisms have been sequenced in genome projects and added to genomic databases for further analysis. These databases grow exponentially and are accessed intensively every day, all over the world. Once a sequence is obtained, its function and/or structure must be determined. Direct experimentation is considered the most reliable way to do so, but the required experiments are complex and time-consuming. For this reason, it is far more productive to use computational methods to infer biological information from a sequence, usually by comparing the new sequence with sequences whose characteristics have already been determined. BLAST is the most widely used heuristic tool for sequence comparison, and thousands of BLAST searches are run daily worldwide. Cluster and grid environments can be used to further reduce BLAST execution time. This paper proposes and evaluates an adaptive task allocation framework for performing BLAST searches in a grid environment. The framework, called PackageBLAST, provides an infrastructure that executes distributed BLAST genomic database comparisons. It is also flexible, since the user can choose among existing task allocation strategies or incorporate new ones. Furthermore, we propose a mechanism that computes each grid node's execution weight, adapting the chosen allocation policy to the node's observed computational power and local load. Our results show very good speedups. For instance, on a 16-machine heterogeneous grid testbed, a speedup of 14.59 was achieved, reducing the BLAST execution time from 30.88 min to 2.11 min. We also show that the adaptive task allocation strategy successfully handled the complexity of a grid environment.
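
The abstract does not give the formula PackageBLAST uses for a node's execution weight, so the following is only a minimal illustrative sketch of the general idea: weight each node by its recently observed throughput, then split the pending work units in proportion to those weights. The names `NodeStats`, `execution_weights`, and `allocate`, and the choice of throughput as the weight, are assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    """Hypothetical per-node measurements; not PackageBLAST's actual fields."""
    name: str
    units_done: int    # work units finished in the last observation window
    elapsed_s: float   # wall-clock seconds the node spent on them


def execution_weights(nodes):
    """Return each node's share of the total observed throughput (sums to 1)."""
    rates = {n.name: n.units_done / n.elapsed_s for n in nodes}
    total = sum(rates.values())
    return {name: rate / total for name, rate in rates.items()}


def allocate(num_units, weights):
    """Split num_units in proportion to weights, using largest-remainder rounding."""
    exact = {name: num_units * w for name, w in weights.items()}
    shares = {name: int(x) for name, x in exact.items()}
    leftover = num_units - sum(shares.values())
    # Hand the leftover units to the nodes with the largest fractional parts.
    by_fraction = sorted(exact, key=lambda n: exact[n] - shares[n], reverse=True)
    for name in by_fraction[:leftover]:
        shares[name] += 1
    return shares


# Assumed measurements: a loaded or slower node completes fewer units per second.
nodes = [
    NodeStats("fast", units_done=30, elapsed_s=10.0),  # 3 units/s
    NodeStats("slow", units_done=10, elapsed_s=10.0),  # 1 unit/s
]
weights = execution_weights(nodes)
shares = allocate(100, weights)
print(shares)  # {'fast': 75, 'slow': 25}
```

Because the weights are recomputed from observed throughput rather than from nominal hardware specifications, a node that becomes busy with local work would receive a smaller share in the next round, which is the kind of adaptation to computational power and local load that the abstract describes.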