ACM SRC poster: SpotMPI: auction-based high performance cloud computing

  • Authors:
  • Moussa Taifi

  • Affiliations:
  • Temple University, Philadelphia, PA, USA

  • Venue:
  • Proceedings of the 2011 companion on High Performance Computing Networking, Storage and Analysis Companion
  • Year:
  • 2011

Abstract

Cloud computing benefits extensively from economies of scale to provide cost-effective computing. Recently, reliability has been introduced as a potential tradeoff point for delivering compute resources while further decreasing the price of cloud resources. Fair market conditions create an environment where both sellers and buyers of compute resources can benefit from trading them, and more efficient resource use can potentially be achieved as a result. While auction-based infrastructure has many advantages, there are currently no practical computing platforms that can harness such volatile environments effectively. This research work reports a methodology and a toolkit designed to address the challenges of using volatile, auctioned cloud resources for MPI applications. Specifically, we emphasize the use of dynamically adjusted optimal checkpoint-restart (CPR) intervals. We discuss an initial analytical model for dealing with price histories and selecting optimal checkpoint intervals. We also describe the SpotMPI toolkit, which can be used to achieve practical execution of MPI applications on volatile auction-based cloud platforms. The result of this exploration is a synthesis of the intrinsic dependencies that exist in MPI-based parallel applications with the publicly available price histories of HPC cloud resources on the Amazon cloud. We study algorithms with different computation vs. communication complexities. Our results show counter-intuitive insights into the optimal bidding and application scaling strategies.
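
The abstract does not spell out the analytical model, so the following is only a minimal illustrative sketch of how a price history could feed a checkpoint interval. It assumes (this is not taken from the poster) that an instance is lost whenever the spot price rises above the bid, that prices are sampled at a fixed step, and that Young's classic first-order approximation, interval = sqrt(2 * C * M), is an acceptable stand-in, where C is the checkpoint cost and M is the mean time between failures. The function names and the sample numbers are hypothetical.

```python
import math


def mean_time_between_failures(prices, bid, step_hours=1.0):
    """Estimate mean uptime (hours) per out-of-bid event.

    Assumption: `prices` are evenly spaced samples, and the instance
    runs only while price <= bid. A failure is counted each time the
    price moves from at-or-below the bid to above it.
    """
    failures = 0
    up_samples = 0
    was_up = False
    for price in prices:
        is_up = price <= bid
        if is_up:
            up_samples += 1
        elif was_up:
            failures += 1  # transition from running to out-of-bid
        was_up = is_up
    if failures == 0:
        return math.inf  # no out-of-bid event observed at this bid
    return (up_samples * step_hours) / failures


def young_interval(checkpoint_cost_hours, mtbf_hours):
    """Young's first-order approximation of the checkpoint interval."""
    return math.sqrt(2.0 * checkpoint_cost_hours * mtbf_hours)


# Hypothetical hourly price samples (USD) and a hypothetical bid.
history = [0.5, 0.5, 0.5, 0.5, 1.0, 0.5, 0.5, 0.5, 0.5, 1.0]
mtbf = mean_time_between_failures(history, bid=0.6)
interval = young_interval(checkpoint_cost_hours=0.02, mtbf_hours=mtbf)
```

Raising the bid in this sketch lengthens the estimated time between failures and therefore the checkpoint interval, at the cost of a higher price ceiling; the model reported in the poster may weigh these factors differently.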