Hmmpfam is a widely used, computation-intensive bioinformatics program for sequence classification. This paper contributes the first highly scalable and robust cluster-based parallel implementation of hmmpfam, built on EARTH (Efficient Architecture for Running Threads), an event-driven, fine-grain multi-threaded program execution model. Compared with the original PVM implementation, our implementation shows notable improvements in absolute speedup and scales better. Experiments on two advanced supercomputing clusters at Argonne National Laboratory achieve an absolute speedup of 222.8 on 128 dual-CPU nodes for a representative data set, reducing the total execution time from 15.9 h (serial program) to only 4.3 min.
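
As a rough check of the figures quoted above, the following sketch recomputes the speedup from the rounded execution times and derives a parallel efficiency. It assumes that absolute speedup means serial time divided by parallel time and that all 256 CPUs (128 nodes with 2 CPUs each) were in use; the function and variable names are illustrative and do not come from the paper. Because 15.9 h and 4.3 min are rounded values, the recomputed speedup is only expected to land near the reported 222.8, not match it exactly.

```python
def absolute_speedup(serial_seconds, parallel_seconds):
    """Speedup of the parallel run relative to the serial program."""
    return serial_seconds / parallel_seconds


def parallel_efficiency(speedup, cpus):
    """Fraction of ideal linear speedup achieved on the given CPU count."""
    return speedup / cpus


# Rounded figures quoted in the abstract.
serial_seconds = 15.9 * 3600   # 15.9 h for the serial program
parallel_seconds = 4.3 * 60    # 4.3 min on the cluster
reported_speedup = 222.8

# Assumption: every CPU of the 128 dual-CPU nodes took part in the run.
cpus = 128 * 2

recomputed_speedup = absolute_speedup(serial_seconds, parallel_seconds)
efficiency = parallel_efficiency(reported_speedup, cpus)

print(f"speedup recomputed from rounded times: {recomputed_speedup:.1f}")
print(f"efficiency from the reported speedup: {efficiency:.2f}")
```

Under these assumptions the recomputed speedup is close to the reported one, and the efficiency comes out somewhat below 1, which is consistent with good but not perfectly linear scaling.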