Hyper-BLAST: a parallelized BLAST on cluster system

  • Authors:
  • Hong-Soog Kim;Hae-Jin Kim;Dong-Soo Han

  • Affiliations:
  • School of Engineering, Information and Communications University, Daejeon, Korea;School of Engineering, Information and Communications University, Daejeon, Korea;School of Engineering, Information and Communications University, Daejeon, Korea

  • Venue:
  • ICCS'03 Proceedings of the 2003 international conference on Computational science: Part III
  • Year:
  • 2003

Abstract

BLAST is an important tool in bioinformatics. It is used to find sequences that are biologically similar to a given query sequence in a database of annotated sequences. For high-throughput processing of a huge number of query sequences, there have been many studies on parallel batch processing of sequence similarity searches using BLAST. As the number of sequences in the database increases at an exponential rate, the search speed of BLAST itself becomes important. Although NCBI has developed a parallel BLAST that uses threads on SMP machines to speed up BLAST, the speedup is still limited because the architecture of an SMP machine restricts the number of processors. In this paper, we present our parallelized BLAST on cluster systems for further speedup. The main strategy is to exploit inter-node parallelism, which can be extracted by logically partitioning the database. For the inter-node parallelism, we have designed and implemented a logical database partitioning method, a mechanism for initiating and coordinating BLAST on remote nodes, and a communication protocol for collecting the remote nodes' results. According to our performance test on a 2-way, 8-node cluster system, a speedup of roughly 12 times has been achieved in terms of the response time of a similarity search for an individual query sequence.
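
The abstract does not give the details of the partitioning method or the result-collection protocol, so the following Python fragment is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a logical partition is a contiguous range of sequence indices assigned to one node, and that each node returns hits as (score, sequence id) pairs that a coordinator merges. The names partition_ranges and merge_hits are hypothetical.

```python
import heapq
from typing import List, Tuple

Hit = Tuple[float, str]  # (score, sequence id); assumed hit format, higher score is better


def partition_ranges(num_sequences: int, num_nodes: int) -> List[Tuple[int, int]]:
    """Split [0, num_sequences) into num_nodes contiguous half-open index ranges.

    The database file itself is not copied or cut; each node is only told
    which index range to search (a "logical" partition in this sketch).
    """
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    base, extra = divmod(num_sequences, num_nodes)
    ranges = []
    start = 0
    for node in range(num_nodes):
        # The first `extra` nodes take one additional sequence each.
        size = base + (1 if node < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def merge_hits(per_node_hits: List[List[Hit]], top_k: int) -> List[Hit]:
    """Combine the hit lists returned by all nodes and keep the top_k best."""
    all_hits = [hit for hits in per_node_hits for hit in hits]
    return heapq.nlargest(top_k, all_hits)


if __name__ == "__main__":
    print(partition_ranges(10, 3))
    print(merge_hits([[(50.0, "seqA"), (10.0, "seqB")], [(70.0, "seqC")]], top_k=2))
```

The sketch ranks hits by raw score for simplicity. BLAST's E-value statistics also depend on the total database size, so a partitioned search would have to account for the size of the whole database rather than of a single partition. As a rough reading of the reported result, and assuming "2-way" means two processors per node (16 processors in total), a 12 times speedup would correspond to a parallel efficiency of about 75%.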