Resource Management for Elastic Cloud Workflows

  • Authors:
  • Li Yu; Douglas Thain

  • Venue:
  • CCGRID '12 Proceedings of the 2012 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (ccgrid 2012)
  • Year:
  • 2012

Abstract

Cloud computing systems have joined campus and private grids as powerful and highly scalable environments for scientific computing. Furthermore, distributed applications are typically expressed in a form that allows them to run on an arbitrary number of nodes while tolerating failures and changes in available resources. This flexibility introduces problems relating to how many nodes an application can use and how those nodes should be allocated. In this paper, we explore these problems by presenting a general-purpose architecture for scalable cloud applications and describing the resource management problems inherent in it. We address these challenges by developing methods for measuring at runtime the number of nodes an application can use, for appropriately placing masters and workers, and for matching workers to masters. Finally, we propose a resource management mechanism that allows automatic resource allocation and flexible resource distribution. These techniques are presented in the context of our specific cloud architecture, but the lessons apply to any system where competing elastic applications must be right-sized to the available resources.