Scheduling data processing flows under budget constraint on the cloud

  • Authors:
  • Fei Cao;Dabin Ding;Dunren Che;Michelle M. Zhu;Wen-Chi Hou

  • Affiliations:
  • Southern Illinois University Carbondale, IL;Southern Illinois University Carbondale, IL;Southern Illinois University Carbondale, IL;Southern Illinois University Carbondale, IL;Southern Illinois University Carbondale, IL

  • Venue:
  • Proceedings of the 2013 Research in Adaptive and Convergent Systems
  • Year:
  • 2013

Abstract

Cloud computing is emerging as a promising paradigm for large-scale data-intensive queries modeled as complex Directed Acyclic Graph (DAG)-structured dataflows, with arbitrary data operators as nodes and producer-consumer interactions as directed edges. Scheduling dataflows on the Cloud is a complex and challenging optimization problem, similar to query optimization. The optimization must satisfy a variety of objectives and constraints while taking into account the particular characteristics of the underlying Cloud environment. In addition to achieving minimum completion time, the commercialization of Clouds requires policies that take users' economic concerns into account as well. In this paper, we formulate the scheduling of dataflows onto Cloud resources with the objective of minimizing the completion time under a given budget constraint. A heuristic scheduling algorithm, Layer-oriented Resource Allocation within Budget constraint (LRA-B), is proposed and evaluated. Experiments are conducted on numerous dataflows and Cloud environment configurations, and the overall results are promising and indicate the effectiveness of our algorithm.
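To make the problem statement concrete, the following is a minimal sketch of budget-constrained scheduling on a layered DAG. It is not the LRA-B algorithm, whose details the abstract does not give. It is a generic greedy heuristic under assumptions introduced here for illustration only: each operator runs on its own VM, layers execute one after another, each VM type is a hypothetical `(speed, price_per_time_unit)` pair ordered from cheapest to fastest, and the names `dag_layers`, `evaluate`, and `schedule_within_budget` are invented for this example.

```python
from collections import defaultdict


def dag_layers(nodes, edges):
    """Group nodes into layers; a node sits one layer below its deepest producer."""
    succs = defaultdict(list)
    indeg = {n: 0 for n in nodes}
    for u, v in edges:
        succs[u].append(v)
        indeg[v] += 1
    level = {n: 0 for n in nodes}
    ready = [n for n in nodes if indeg[n] == 0]
    visited = 0
    while ready:
        u = ready.pop()
        visited += 1
        for v in succs[u]:
            level[v] = max(level[v], level[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    if visited != len(nodes):
        raise ValueError("dataflow graph contains a cycle")
    layers = defaultdict(list)
    for n in nodes:
        layers[level[n]].append(n)
    return [layers[i] for i in sorted(layers)]


def evaluate(layers, work, vm_types, assign):
    """Return (makespan, cost) under the simplified model assumed above."""
    makespan = 0.0
    cost = 0.0
    for layer in layers:
        slowest = 0.0
        for n in layer:
            speed, price = vm_types[assign[n]]
            run_time = work[n] / speed
            cost += run_time * price          # pay per time unit used
            slowest = max(slowest, run_time)
        makespan += slowest                   # a layer ends when its slowest node ends
    return makespan, cost


def schedule_within_budget(nodes, edges, work, vm_types, budget):
    """Greedy sketch: start on the cheapest VM type everywhere, then repeatedly
    apply the single upgrade with the best makespan reduction per extra cost
    that still fits the budget."""
    layers = dag_layers(nodes, edges)
    assign = {n: 0 for n in nodes}            # index 0 is assumed to be the cheapest type
    makespan, cost = evaluate(layers, work, vm_types, assign)
    if cost > budget:
        raise ValueError("budget is below the cost of the cheapest assignment")
    while True:
        best = None
        for n in nodes:
            if assign[n] + 1 >= len(vm_types):
                continue                      # already on the fastest type
            trial = dict(assign)
            trial[n] += 1
            m, c = evaluate(layers, work, vm_types, trial)
            if c <= budget and m < makespan:
                gain = (makespan - m) / max(c - cost, 1e-9)
                if best is None or gain > best[0]:
                    best = (gain, n, m, c)
        if best is None:
            return assign, makespan, cost
        _, n, makespan, cost = best
        assign[n] += 1
```

A call such as `schedule_within_budget(["a", "b", "c", "d"], [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")], work, vm_types, budget)` returns the chosen VM-type index per operator together with the resulting completion time and cost. Because it upgrades one operator at a time, this sketch can stall when two operators in the same layer are equally slow. A layer-oriented method would presumably consider such a layer as a whole.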