Monte Carlo simulation on heterogeneous distributed systems: A computing framework with parallel merging and checkpointing strategies

  • Authors:
  • Sorina Camarasu-Pop;Tristan Glatard;Rafael Ferreira Da Silva;Pierre Gueth;David Sarrut;Hugues Benoit-Cattin

  • Venue:
  • Future Generation Computer Systems
  • Year:
  • 2013

Abstract

This paper introduces an end-to-end framework for efficiently computing and merging Monte Carlo simulations on heterogeneous distributed systems. Simulations are parallelized using a dynamic load-balancing approach and multiple parallel mergers. Checkpointing is used to improve reliability and to enable incremental merging of partial results. A model is proposed to analyze the behavior of the framework and to help tune its parameters. Experimental results obtained on a production grid infrastructure show that the model fits the real makespan with a relative error of at most 10%, that using multiple parallel mergers reduces the makespan by 40% on average, that checkpointing enables the completion of very long simulations, and that it can be used without penalizing the makespan.
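
The three ideas named in the abstract (dynamic load balancing, checkpointed partial results, and multiple mergers) can be illustrated with a toy sketch. This is not the authors' implementation: the names `Partial`, `simulate_chunk`, `run`, and `parallel_merge` are hypothetical, a sequential round-based loop stands in for real grid jobs, and a simple estimate of pi stands in for the actual simulation. It only assumes that partial results can be merged by summation.

```python
import random
from dataclasses import dataclass


@dataclass
class Partial:
    """A checkpointed partial result: events simulated and hits counted."""
    events: int = 0
    hits: int = 0


def merge(partials):
    """Merge partial results by summing their counters.

    Summation is associative, so any grouping of partials gives the same total.
    """
    total = Partial()
    for p in partials:
        total.events += p.events
        total.hits += p.hits
    return total


def simulate_chunk(rng, n):
    """Simulate n events: count random points inside the unit quarter-circle."""
    hits = sum(1 for _ in range(n) if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return Partial(n, hits)


def run(total_events, worker_speeds, seed=0):
    """Toy dynamic load balancing (assumed scheme, for illustration only).

    No worker gets a fixed share. In each round a worker checkpoints up to
    `speed` events, and everyone stops once the global target is reached,
    so faster workers end up contributing more events.
    Returns one list of checkpoints per worker.
    """
    rngs = [random.Random(seed + i) for i in range(len(worker_speeds))]
    checkpoints = [[] for _ in worker_speeds]
    done = 0
    while done < total_events:
        for w, speed in enumerate(worker_speeds):
            n = min(speed, total_events - done)
            if n == 0:
                break  # target reached mid-round
            checkpoints[w].append(simulate_chunk(rngs[w], n))
            done += n
    return checkpoints


def parallel_merge(partials, n_mergers):
    """Split the partials among n_mergers, merge each share, then merge the results.

    The shares are independent, so on a real system each could be handled
    by a separate merger process; here they are processed one after another.
    """
    shares = [partials[i::n_mergers] for i in range(n_mergers)]
    return merge(merge(share) for share in shares)


if __name__ == "__main__":
    cps = run(10_000, worker_speeds=[100, 400, 250])
    flat = [p for worker in cps for p in worker]
    result = parallel_merge(flat, n_mergers=4)
    print("events per worker:", [merge(c).events for c in cps])
    print("pi estimate:", 4 * result.hits / result.events)
```

Because each checkpoint is a self-contained partial result, the work a worker has already saved remains mergeable if that worker later fails, and an intermediate estimate can be produced at any time by merging the checkpoints collected so far.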