In recent years, the rate of genomic sequence generation has increased dramatically owing to significant advances in sequencing technology. Genomic data is accumulating at an exponential rate in databases around the world, and rapid analysis techniques will enhance knowledge discovery in medicine and biotechnology. Analyzing such growing sequence databases demands tremendous computational power that can only be provided by massively parallel computers. Improving the performance and scalability of bioinformatics tools thus becomes a critical step in the quest to transform ever-growing raw genomic data into biological knowledge. In this paper we describe an efficient parallel implementation of a profile hidden Markov model (profile HMM) code used for protein domain identification, along with an auto-tuned parallel I/O optimization. Experimental results show linear speedup with increasing numbers of computing cores on a supercomputer, allowing domain identification for millions of proteins in a few minutes using hundreds of thousands of computing cores.