NAMD: biomolecular simulation on thousands of processors
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
SuperLU_DIST: A scalable distributed-memory sparse direct solver for unsymmetric linear systems
ACM Transactions on Mathematical Software (TOMS)
Performance of Various Computers Using Standard Linear Equations Software
Performance Comparison of MPI Implementations over InfiniBand, Myrinet and Quadrics
Proceedings of the 2003 ACM/IEEE conference on Supercomputing
SC '05 Proceedings of the 2005 ACM/IEEE conference on Supercomputing
High performance RDMA-based MPI implementation over InfiniBand
International Journal of Parallel Programming - Special issue I: The 17th annual international conference on supercomputing (ICS'03)
RDMA read based rendezvous protocol for MPI over InfiniBand: design alternatives and benefits
Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Performance characterization of molecular dynamics techniques for biomolecular simulations
Proceedings of the eleventh ACM SIGPLAN symposium on Principles and practice of parallel programming
InfiniBand scalability in Open MPI
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Shared receive queue based scalable MPI design for InfiniBand clusters
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Adaptive connection management for scalable MPI over InfiniBand
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
The University of Florida sparse matrix collection
ACM Transactions on Mathematical Software (TOMS)
High performance MPI design using unreliable datagram for ultra-scale InfiniBand clusters
Proceedings of the 21st annual international conference on Supercomputing
High-performance Ethernet-based communications for future multi-core processors
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
Evaluating Sparse Data Storage Techniques for MPI Groups and Communicators
ICCS '08 Proceedings of the 8th international conference on Computational Science, Part I
MPC-MPI: An MPI Implementation Reducing the Overall Memory Consumption
Proceedings of the 16th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Impact of Node Level Caching in MPI Job Launch Mechanisms
Proceedings of the 16th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Journal of Parallel and Distributed Computing
Investigations on InfiniBand: efficient network buffer utilization at scale
PVM/MPI'07 Proceedings of the 14th European conference on Recent Advances in Parallel Virtual Machine and Message Passing Interface
Computers and Electronics in Agriculture
InfiniBand is an emerging HPC interconnect being deployed in very large-scale clusters, with even larger InfiniBand-based clusters expected in the near future. The Message Passing Interface (MPI) is the programming model of choice for scientific applications running on these large-scale clusters, so it is critical that the MPI implementation used be based on a scalable, high-performance design. We analyze the performance and scalability of MVAPICH, a popular open-source MPI implementation for InfiniBand, from an application standpoint. We measure the performance and memory requirements of the MPI library while executing several well-known applications and benchmarks, such as NAS, SuperLU, NAMD, and HPL, on a 64-node InfiniBand cluster. Our analysis reveals that the latest MVAPICH design requires an order of magnitude less internal MPI memory (averaged per process) while still delivering the best possible performance. Further, for the benchmarks and applications evaluated, the internal memory requirement of MVAPICH remains nearly constant at around 5-10 MB as the number of processes increases, indicating that the MVAPICH design is highly scalable.
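The memory figures quoted in the abstract are averages per process, collected while real applications run over the MPI library. As a rough illustration of that kind of measurement only (the paper's own instrumentation lives inside MVAPICH and is not shown here), the sketch below samples each rank's resident set size from Linux's /proc/self/status around an application phase and averages the growth across all ranks; the file path, the use of VmRSS, and the placement of the probes are assumptions of this example, not the authors' method.

```c
/* Illustrative sketch only: averaging per-process memory growth across MPI
 * ranks, assuming a Linux /proc filesystem. It does not reproduce the
 * paper's internal MVAPICH instrumentation. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read the calling process's resident set size (VmRSS, in kB). */
static long rss_kb(void)
{
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    long kb = -1;

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        if (strncmp(line, "VmRSS:", 6) == 0) {
            kb = atol(line + 6);   /* skip the "VmRSS:" label */
            break;
        }
    }
    fclose(f);
    return kb;
}

int main(int argc, char **argv)
{
    int rank, size;
    long before, after, delta, sum = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    before = rss_kb();

    /* ... application or benchmark kernel would run here ... */

    after = rss_kb();
    delta = after - before;

    /* Average the per-process growth, mirroring the "average per process"
     * figure reported in the abstract. */
    MPI_Reduce(&delta, &sum, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("average RSS growth per process: %ld kB\n", sum / size);

    MPI_Finalize();
    return 0;
}
```

A probe like this captures the whole process image (application data plus library buffers), so it overstates the MPI-internal share; separating the two is exactly why library-level accounting, as used in the study, is preferable.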