Abstract: In this paper we discuss an implementation of the Message Passing Interface (MPI) standard for the IBM Scalable POWERparallel 1 and 2 (SP1, SP2). The key to a reliable and efficient implementation of a message-passing library on these machines is the careful design of a UNIX-socket-like layer in user space, with controlled access to the communication adapters and with adequate recovery and flow control. The performance of this implementation is at the same level as the IBM-proprietary Message Passing Library (MPL). We also show that on the IBM SP1 and SP2 we achieve an integrated tracing ability, in which both system events, such as context switches and page faults, and MPI-related activities are traced with minimal overhead to the application program, thus presenting application programmers with a trace of all the events that ultimately affect the efficiency of a parallel program.