ScELA: scalable and extensible launching architecture for clusters

Authors:
Jaidev K. Sridhar; Matthew J. Koop; Jonathan L. Perkins; Dhabaleswar K. Panda
Affiliations:
Network-Based Computing Laboratory, The Ohio State University, Columbus, OH; Network-Based Computing Laboratory, The Ohio State University, Columbus, OH; Network-Based Computing Laboratory, The Ohio State University, Columbus, OH; Network-Based Computing Laboratory, The Ohio State University, Columbus, OH
Venue:
HiPC'08 Proceedings of the 15th international conference on High performance computing
Year:
2008

Abstract

As cluster sizes head into tens of thousands, current job launch mechanisms do not scale as they are limited by resource constraints as well as performance bottlenecks. The job launch process includes two phases - spawning of processes on processors and information exchange between processes for job initialization. Implementations of various programming models follow distinct protocols for the information exchange phase. We present the design of a scalable, extensible and high-performance job launch architecture for very large scale parallel computing. We present implementations of this architecture which achieve a speedup of more than 700% in launching a simple Hello World MPI application on 10,240 processor cores and also scale to more than 3 times the number of processor cores compared to prior solutions.
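
The two phases named in the abstract can be pictured with a toy model. The sketch below is not ScELA's implementation and says nothing about its actual design: threads stand in for remote processes, and the function name `launch`, the in-memory `store`, and the `endpoint-N` strings are all hypothetical. It assumes a simple publish, wait, read pattern for the information exchange, which the abstract does not specify.

```python
import threading


def launch(num_procs):
    """Toy two-phase launch: spawn workers, then exchange startup info."""
    store = {}                      # hypothetical launcher-side key-value store
    lock = threading.Lock()
    barrier = threading.Barrier(num_procs)
    views = [None] * num_procs      # what each rank sees after the exchange

    def worker(rank):
        # Phase 2a: publish this rank's (made-up) endpoint address.
        with lock:
            store[rank] = f"endpoint-{rank}"
        # Wait until every rank has published.
        barrier.wait()
        # Phase 2b: read every peer's endpoint.
        with lock:
            views[rank] = dict(store)

    # Phase 1: spawn one worker per rank (threads stand in for processes).
    workers = [threading.Thread(target=worker, args=(r,))
               for r in range(num_procs)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return views


if __name__ == "__main__":
    views = launch(4)
    # Every rank should end up with the endpoints of all four ranks.
    print(all(len(v) == 4 for v in views))
```

In this toy every rank talks to one shared store. On a real cluster those reads and writes would be network traffic, which is one plausible place for the bottlenecks the abstract mentions to show up as the number of ranks grows.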