In this paper, we present a comprehensive performance comparison of MPI implementations over InfiniBand, Myrinet, and Quadrics. Our performance evaluation consists of two major parts. The first part is a set of MPI-level micro-benchmarks that characterize different aspects of the MPI implementations. The second part consists of application-level benchmarks, for which we have used the NAS Parallel Benchmarks and the sweep3D benchmark. We not only present the overall performance results, but also relate application communication characteristics to the information we acquired from the micro-benchmarks. Our results show that each of the three MPI implementations has its own advantages and disadvantages. For our 8-node cluster, InfiniBand can offer significant performance improvements for a number of applications compared with Myrinet and Quadrics when using the PCI-X bus. Even with just the PCI bus, InfiniBand can still perform better if the applications are bandwidth-bound.