MPICH-V: toward a scalable fault tolerant MPI for volatile nodes

Authors:
George Bosilca;Aurelien Bouteiller;Franck Cappello;Samir Djilali;Gilles Fedak;Cecile Germain;Thomas Herault;Pierre Lemarinier;Oleg Lodygensky;Frederic Magniette;Vincent Neri;Anton Selikhov
Affiliations:
LRI, Université de Paris Sud, Orsay, France (all authors)
Venue:
Proceedings of the 2002 ACM/IEEE conference on Supercomputing
Year:
2002


Abstract

Global Computing platforms, large-scale clusters, and future TeraGRID systems gather thousands of nodes to run parallel scientific applications. At this scale, node failures and disconnections are frequent events. This volatility reduces the MTBF of the whole system to the range of hours or minutes.

We present MPICH-V, an automatic volatility-tolerant MPI environment based on uncoordinated checkpoint/rollback and distributed message logging. The MPICH-V architecture relies on Channel Memories, Checkpoint Servers, and theoretically proven protocols to execute existing or new SPMD and Master-Worker MPI applications on volatile nodes.

To evaluate its capabilities, we run MPICH-V within a framework in which the number of nodes, Channel Memories, and Checkpoint Servers, as well as the node volatility, can be fully configured. We present a detailed performance evaluation of every component of MPICH-V and of its global performance for non-trivial parallel applications. Experimental results demonstrate good scalability and high tolerance to node volatility.
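
The abstract names the technique but not its mechanics, so the following is only a minimal sketch of the general idea behind uncoordinated checkpoint/rollback combined with message logging. It is not MPICH-V's implementation or API. The names `ChannelMemory`, `Node`, `put`, `get`, and `demo` are hypothetical, and the sketch assumes a single in-process log standing in for a remote, stable Channel Memory. Each node checkpoints on its own schedule; after a crash, only that node rolls back and replays its logged messages in the original order.

```python
from collections import defaultdict


class ChannelMemory:
    """Toy stand-in for a stable message log (hypothetical, not MPICH-V's API)."""

    def __init__(self):
        # Per-receiver list of (sender, payload), kept in arrival order.
        self._log = defaultdict(list)

    def put(self, sender, receiver, payload):
        self._log[receiver].append((sender, payload))

    def get(self, receiver, index):
        """Return the index-th message logged for receiver, or None if absent."""
        log = self._log[receiver]
        return log[index] if index < len(log) else None


class Node:
    """Toy process whose application state is the sum of received payloads."""

    def __init__(self, rank, channel_memory):
        self.rank = rank
        self.cm = channel_memory
        self.consumed = 0  # how many logged messages this node has received
        self.state = 0

    def send(self, dest, payload):
        self.cm.put(self.rank, dest, payload)

    def receive(self):
        message = self.cm.get(self.rank, self.consumed)
        if message is None:
            return False
        self.consumed += 1
        self.state += message[1]
        return True

    def checkpoint(self):
        # Uncoordinated: no other node is involved in taking this snapshot.
        return (self.consumed, self.state)

    def restart(self, snapshot):
        self.consumed, self.state = snapshot


def demo():
    cm = ChannelMemory()
    sender, receiver = Node(0, cm), Node(1, cm)
    for value in (1, 2, 3):
        sender.send(1, value)
    receiver.receive()
    snapshot = receiver.checkpoint()
    receiver.receive()
    receiver.receive()
    state_before_crash = receiver.state
    receiver.restart(snapshot)   # simulate a crash and rollback of node 1 only
    while receiver.receive():    # replay logged messages in the original order
        pass
    return state_before_crash, receiver.state
```

Calling `demo()` returns the receiver's state before the simulated crash and after replay; the two values match because the log delivers the same messages in the same order. The sketch deliberately ignores issues a real system must handle, such as messages re-sent by a node during its own replay, failure detection, and the cost of routing traffic through the log.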