As cluster sizes head into tens of thousands, current job launch mechanisms do not scale as they are limited by resource constraints as well as performance bottlenecks. The job launch process includes two phases - spawning of processes on processors and information exchange between processes for job initialization. Implementations of various programming models follow distinct protocols for the information exchange phase. We present the design of a scalable, extensible and high-performance job launch architecture for very large scale parallel computing. We present implementations of this architecture which achieve a speedup of more than 700% in launching a simple Hello World MPI application on 10,240 processor cores and also scale to more than 3 times the number of processor cores compared to prior solutions.
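The two phases named in the abstract can be illustrated with a toy sketch. It is not the architecture the paper presents: threads stand in for remote processes, and the function `launch`, the in-memory dictionary `kvs`, and the endpoint strings are all hypothetical names chosen for illustration only.

```python
import threading

def launch(num_ranks):
    """Toy two-phase job launch; threads stand in for remote processes."""
    kvs = {}                               # hypothetical shared key-value store
    kvs_lock = threading.Lock()
    barrier = threading.Barrier(num_ranks)
    views = [None] * num_ranks             # what each rank learns about its peers

    def worker(rank):
        # Phase 2a: publish this rank's (made-up) endpoint.
        with kvs_lock:
            kvs[rank] = f"node{rank}:port{5000 + rank}"
        # Wait until every rank has published.
        barrier.wait()
        # Phase 2b: read every peer's endpoint.
        with kvs_lock:
            views[rank] = dict(kvs)

    # Phase 1: spawn one worker per rank.
    threads = [threading.Thread(target=worker, args=(r,))
               for r in range(num_ranks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return views

views = launch(4)
print(views[0][3])
```

In this sketch every rank reads the full table through one shared store, so the exchange is all-to-all through a single point. That single point is the kind of bottleneck the abstract says limits current mechanisms at large scale.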