Morco: middleware framework for long-running multi-component applications on batch grids

  • Authors:
  • Sivagama Sundari M.; Sathish S. Vadhiyar; Ravi S. Nanjundiah

  • Affiliations:
  • Indian Institute of Science, Bangalore, India; Indian Institute of Science, Bangalore, India; Indian Institute of Science, Bangalore, India

  • Venue:
  • Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
  • Year:
  • 2010

Abstract

Computational grids with multiple batch systems (batch grids) have been used for efficient execution of loosely coupled and workflow-based parallel applications, but they can also be powerful infrastructures for executing long-running multi-component parallel applications. In this paper, we have constructed a generic middleware framework for executing long-running multi-component applications whose execution times far exceed the execution time limits of batch queues. Our framework coordinates the distribution, execution, migration and restart of the application's components across multiple queues, where the component jobs in different queues can have different queue waiting and startup times. We have used our framework with a prominent long-running multi-component climate modeling application, the Community Climate System Model (CCSM). We performed real multi-site CCSM runs for 6.5 days of wallclock time, spanning three sites with four queues and emulated external workloads. Our experiments indicate that multi-site execution can yield good application throughput.
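
The core constraint described above, an application that needs far more run time than a batch queue allows per job, can be illustrated with a minimal sketch. This is not the Morco implementation: the `Queue` class, the `run_in_segments` function, the fixed per-job waiting time and all numbers are hypothetical. The sketch only shows how a long run breaks into a chain of checkpointed jobs, each preceded by a queue wait.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    """Hypothetical batch queue; not a type from the paper."""
    name: str
    wallclock_limit_h: float  # longest a single job may run
    wait_h: float             # assumed constant wait before each job starts


def run_in_segments(total_work_h, queue):
    """Split a long run into jobs that each fit the queue's time limit.

    Returns a list of (start, end, work_done) tuples in hours. Each job is
    assumed to checkpoint at its end and the next job restarts from there.
    """
    done = 0.0
    clock = 0.0
    segments = []
    while done < total_work_h:
        clock += queue.wait_h  # job sits in the queue before starting
        run = min(queue.wallclock_limit_h, total_work_h - done)
        start = clock
        clock += run
        done += run
        segments.append((start, clock, done))
    return segments


# Illustrative values only: 100 h of work on a queue with a 24 h limit.
segments = run_in_segments(100.0, Queue("siteA", 24.0, 2.0))
for start, end, done in segments:
    print(f"job ran {start:.0f}h to {end:.0f}h, work completed: {done:.0f}h")
```

With real queues the waiting time differs per submission and per site, which is why coordinating several queues, as the abstract describes, can matter for throughput.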