Commercial HPC applications are often run on clusters that use the Microsoft Windows operating system, and they need an MPI implementation that runs efficiently in the Windows environment. The MPI developer community, however, is more familiar with the issues involved in implementing MPI in a Unix environment. In this paper, we discuss some of the differences between implementing MPI on Windows and on Unix, particularly with respect to asynchronous progress, process management, shared-memory access, and threads. We describe how we implement MPICH2 on Windows and exploit Windows-specific features in these areas while keeping large parts of the code common with the Unix version. We also present results comparing the performance of MPICH2 on Unix and Windows on the same hardware. For zero-byte MPI messages, we measured shared-memory latencies of 240 nanoseconds on Unix and 275 nanoseconds on Windows.