Parallel scheduling of complex dags under uncertainty

Authors:
Grzegorz Malewicz
Affiliations:
University of Alabama, Tuscaloosa, AL and Argonne National Laboratory, Argonne, IL
Venue:
Proceedings of the seventeenth annual ACM symposium on Parallelism in algorithms and architectures
Year:
2005

Abstract

This paper introduces a parallel scheduling problem where a directed acyclic graph modeling t tasks and their dependencies needs to be executed on n unreliable workers. Worker i executes task j correctly with probability p_{i,j}. The goal is to find a regimen Ε that dictates how workers get assigned to tasks (possibly in parallel and redundantly) throughout execution, so as to minimize expected completion time. This fundamental parallel scheduling problem arises in grid computing and project management, and has several practical applications.

We show a polynomial time algorithm for the problem restricted to the case when dag width is at most a constant and the number of workers is also at most a constant. These two restrictions may appear to be too severe. However, they are fundamentally required. Specifically, we demonstrate that the problem is NP-hard with a constant number of workers when dag width can grow, and is also NP-hard with constant dag width when the number of workers can grow. When both dag width and the number of workers are unconstrained, the problem is inapproximable within a factor less than 5/4, unless P=NP.
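
To make the problem statement concrete, here is a minimal Python sketch of an exhaustive dynamic program over sets of completed tasks. It is an illustration under stated assumptions, not the paper's algorithm: it assumes time advances in unit steps, each worker attempts exactly one eligible task per step, attempts succeed independently, and a task completes when at least one worker assigned to it succeeds. The names `min_expected_time`, `preds`, and `p` are hypothetical. The running time is exponential in general; it stays manageable only on tiny instances.

```python
from functools import lru_cache
from itertools import product


def min_expected_time(preds, p):
    """Minimum expected number of steps to finish every task.

    preds[j] is the set of tasks that must finish before task j (assumed acyclic).
    p[i][j] is the probability that worker i executes task j correctly.
    Assumed model: unit-time steps, one task per worker per step,
    independent successes, redundant assignment allowed.
    """
    tasks = range(len(preds))
    all_done = frozenset(tasks)

    @lru_cache(maxsize=None)
    def expected(done):
        if done == all_done:
            return 0.0
        # A task is eligible once all of its predecessors are done.
        eligible = [j for j in tasks if j not in done and preds[j] <= done]
        best = float("inf")
        # Try every way of assigning each worker to one eligible task.
        for assignment in product(eligible, repeat=len(p)):
            # fail[j]: probability that all workers assigned to j fail.
            fail = {j: 1.0 for j in assignment}
            for i, j in enumerate(assignment):
                fail[j] *= 1.0 - p[i][j]
            attempted = list(fail)
            stay = 1.0  # probability that no attempted task completes
            for j in attempted:
                stay *= fail[j]
            if stay >= 1.0:
                continue  # this assignment can never make progress
            total = 1.0  # one step is spent regardless of the outcome
            for outcome in product([False, True], repeat=len(attempted)):
                if not any(outcome):
                    continue  # the "no progress" case is handled via `stay`
                prob = 1.0
                newly = set()
                for j, ok in zip(attempted, outcome):
                    prob *= (1.0 - fail[j]) if ok else fail[j]
                    if ok:
                        newly.add(j)
                if prob > 0.0:
                    total += prob * expected(done | frozenset(newly))
            # Solve E = total + stay * E for E.
            best = min(best, total / (1.0 - stay))
        return best

    return expected(frozenset())


# Usage: a two-task chain (task 1 depends on task 0) with one worker.
chain_time = min_expected_time([set(), {0}], [[0.5, 0.25]])
```

In this sketch each state is a precedence-closed set of completed tasks. When dag width and the number of workers are both bounded by constants, the number of such states and the number of assignments per state are polynomial in t, which gives some intuition for why those two restrictions matter.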