Cost optimized provisioning of elastic resources for application workflows

Authors:
Eun-Kyu Byun; Yang-Suk Kee; Jin-Soo Kim; Seungryoul Maeng
Affiliations:
Department of Computer Science, Korea Advanced Institute of Science and Technology, Daejeon 305-701, South Korea; Oracle USA Inc., Redwood Shores, CA 94065, USA; School of Information and Communication Eng., Sungkyunkwan University, Suwon, Gyeonggi-do 440-746, South Korea; Department of Computer Science, Korea Advanced Institute of Science and Technology, Daejeon 305-701, South Korea
Venue:
Future Generation Computer Systems
Year:
2011


Abstract

Workflow technologies have become a major vehicle for the easy and efficient development of scientific applications. Meanwhile, state-of-the-art resource provisioning technologies such as cloud computing enable users to acquire computing resources dynamically and elastically. A critical challenge in integrating workflow technologies with resource provisioning technologies is determining the right amount of resources required to execute a workflow, so as to minimize the financial cost for users and maximize resource utilization for resource providers. This paper proposes an architecture for the automatic execution of large-scale workflow-based applications on dynamically and elastically provisioned computing resources. In particular, we focus on its core algorithm, PBTS (Partitioned Balanced Time Scheduling), which estimates the minimum number of computing hosts required to execute a workflow within a user-specified finish time. The PBTS algorithm is designed to fit both elastic resource provisioning models such as Amazon EC2 and malleable parallel application models such as MapReduce. Experimental results with a number of synthetic workflows and several real science workflows demonstrate that PBTS estimates a resource capacity close to the theoretical lower bound.
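
To make the notion of a lower bound on host count concrete, the following is a minimal Python sketch. It is not the PBTS algorithm, whose details the abstract does not give. It assumes a workflow is described by a hypothetical `runtimes` mapping (task name to runtime) and a hypothetical `deps` mapping (task name to its predecessor tasks), with each task occupying one host for its full runtime. Under those assumptions, no schedule can meet a deadline D with fewer than ceil(total work / D) hosts, and no number of hosts helps if the longest dependency chain already exceeds D.

```python
import math
from graphlib import TopologicalSorter  # Python 3.9+


def critical_path_length(runtimes, deps):
    """Length of the longest dependency chain, in the time unit of `runtimes`."""
    # Include every task, even those with no predecessors listed in `deps`.
    graph = {task: deps.get(task, ()) for task in runtimes}
    finish = {}
    for task in TopologicalSorter(graph).static_order():
        # A task can start only after all of its predecessors have finished.
        start = max((finish[p] for p in graph[task]), default=0.0)
        finish[task] = start + runtimes[task]
    return max(finish.values(), default=0.0)


def host_lower_bound(runtimes, deps, deadline):
    """Lower bound on hosts needed, or None if the deadline is infeasible.

    Assumption: one task runs on one host at a time, without preemption.
    The bound is necessary but not always sufficient.
    """
    if critical_path_length(runtimes, deps) > deadline:
        return None  # even unlimited hosts cannot meet the deadline
    total_work = sum(runtimes.values())
    return max(1, math.ceil(total_work / deadline))


# Hypothetical fork-join workflow: a -> {b, c, d} -> e
example_runtimes = {"a": 2, "b": 3, "c": 3, "d": 3, "e": 2}
example_deps = {"b": ["a"], "c": ["a"], "d": ["a"], "e": ["b", "c", "d"]}
```

For the example workflow with a deadline of 7 time units, `host_lower_bound` returns 2, yet three hosts are actually needed, because `b`, `c`, and `d` must all run in the same 3-unit window. The bound ignores the shape of the dependency graph, which is why an estimator that accounts for task placement over time is needed to get close to it in practice.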