Analysis of Performance-impacting Factors on Checkpointing Frameworks

  • Authors:
  • Gabriel Rodríguez;María J. Martín;Patricia González;Juan Touriño

  • Venue:
  • The Computer Journal
  • Year:
  • 2011

Abstract

This paper focuses on the performance evaluation of Compiler for Portable Checkpointing (CPPC), a tool for the checkpointing of parallel message-passing applications. Its performance and the factors that impact it are transparently and rigorously identified and assessed. The tests were performed on a public supercomputing infrastructure, using a large number of very different applications and showing excellent results in terms of performance and effort required for integration into user codes. Statistical analysis techniques have been used to better approximate the performance of the tool. Quantitative and qualitative comparisons with other rollback-recovery approaches to fault tolerance are also included. All these data and comparisons are then discussed in an effort to extract meaningful conclusions about the state of the art and future research trends in the rollback-recovery field.