ACM Transactions on Programming Languages and Systems (TOPLAS)
MPI is a message-passing standard widely used for developing high-performance parallel applications. Because of restrictions in the MPI computation model, conventional implementations on shared-memory machines map each MPI node to an OS process. This mapping suffers serious performance degradation in the presence of multiprogramming, especially when the OS job scheduler employs a space/time sharing policy. In this paper, we study compile-time and run-time support for executing MPI nodes as threads and demonstrate our optimization techniques on a large class of MPI programs written in C. The compile-time transformation adopts thread-specific data structures to eliminate the use of global and static variables in C code. The run-time support includes an efficient point-to-point communication protocol based on a novel lock-free queue management scheme. Our experiments on an SGI Origin 2000 show that our MPI prototype, called TMPI, which uses the proposed techniques, is competitive with SGI's native MPI implementation in a dedicated environment and has significant performance advantages in a multiprogrammed environment, with up to a 23-fold improvement.
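To illustrate why global and static variables must be eliminated when MPI nodes become threads, the following is a minimal sketch in C using POSIX thread-specific data. It is not TMPI's actual transformation output: the names `counter_key`, `counter_ptr`, `node_main`, and `run_nodes` are hypothetical, and the sketch assumes the original program held a single `static int counter` that each MPI process owned privately.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Assumed original code: `static int counter = 0;` (one copy per process).
 * With one thread per MPI node, that copy would be shared by all nodes,
 * so each access is redirected to a per-thread copy instead. */
static pthread_key_t counter_key;                 /* hypothetical name */
static pthread_once_t counter_once = PTHREAD_ONCE_INIT;

static void make_counter_key(void) {
    /* `free` runs on each thread's copy when that thread exits. */
    if (pthread_key_create(&counter_key, free) != 0) abort();
}

/* Returns the calling thread's private copy, allocating it on first use. */
static int *counter_ptr(void) {
    pthread_once(&counter_once, make_counter_key);
    int *p = pthread_getspecific(counter_key);
    if (p == NULL) {
        p = calloc(1, sizeof *p);                 /* starts at 0, like the static */
        assert(p != NULL);
        if (pthread_setspecific(counter_key, p) != 0) abort();
    }
    return p;
}

struct node_arg { int steps; int result; };

/* Body of one "MPI node" running as a thread: `counter++` in the
 * original source becomes `(*counter_ptr())++` after the rewrite. */
static void *node_main(void *arg) {
    struct node_arg *a = arg;
    for (int i = 0; i < a->steps; i++)
        (*counter_ptr())++;
    a->result = *counter_ptr();
    return NULL;
}

/* Runs n nodes concurrently; results[i] is node i's final counter value. */
void run_nodes(const int steps[], int results[], int n) {
    pthread_t *tids = malloc((size_t)n * sizeof *tids);
    struct node_arg *args = malloc((size_t)n * sizeof *args);
    assert(tids != NULL && args != NULL);
    for (int i = 0; i < n; i++) {
        args[i].steps = steps[i];
        args[i].result = 0;
        if (pthread_create(&tids[i], NULL, node_main, &args[i]) != 0) abort();
    }
    for (int i = 0; i < n; i++) {
        pthread_join(tids[i], NULL);
        results[i] = args[i].result;
    }
    free(tids);
    free(args);
}
```

Calling `run_nodes` with step counts `{3, 5}` leaves `{3, 5}` in `results`, because each thread increments only its own copy. With a plain `static int counter`, the two threads would race on one shared variable.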