Grid Technology with Dynamic Load Balancing for Monte Carlo Simulations
PARA '02 Proceedings of the 6th International Conference on Applied Parallel Computing Advanced Scientific Computing
An Overview of Checkpointing in Uniprocessor and DistributedSystems, Focusing on Implementation and Performance
MapReduce: simplified data processing on large clusters
Communications of the ACM - 50th anniversary issue: 1958 - 2008
SimGrid: A Generic Framework for Large-Scale Distributed Experiments
UKSIM '08 Proceedings of the Tenth International Conference on Computer Modeling and Simulation
Flexible and Efficient Workflow Deployment of Data-Intensive Applications On Grids With MOTEUR
International Journal of High Performance Computing Applications
MapReduce for Data Intensive Scientific Analyses
ESCIENCE '08 Proceedings of the 2008 Fourth IEEE International Conference on eScience
NSDI'10 Proceedings of the 7th USENIX conference on Networked systems design and implementation
Processing moldable tasks on the grid: Late job binding with lightweight user-level overlay
Future Generation Computer Systems
MARLA: MapReduce for Heterogeneous Clusters
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
Investigation of Data Locality in MapReduce
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
TomusBlobs: Towards Communication-Efficient Storage for MapReduce Applications in Azure
CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
A parallel computational model for GATE simulations
Computer Methods and Programs in Biomedicine
Hi-index | 0.00 |
This paper introduces an end-to-end framework for efficient computing and merging of Monte Carlo simulations on heterogeneous distributed systems. Simulations are parallelized using a dynamic load-balancing approach and multiple parallel mergers. Checkpointing is used to improve reliability and to enable incremental results merging from partial results. A model is proposed to analyze the behavior of the proposed framework and help tune its parameters. Experimental results obtained on a production grid infrastructure show that the model fits the real makespan with a relative error of maximum 10%, that using multiple parallel mergers reduces the makespan by 40% on average, that checkpointing enables the completion of very long simulations and that it can be used without penalizing the makespan.