Optimizing latency and throughput of application workflows on clusters

Authors:
Naga Vydyanathan;Umit Catalyurek;Tahsin Kurc;Ponnuswamy Sadayappan;Joel Saltz
Affiliations:
Dept. of Computer Science and Engineering, The Ohio State University, United States;Dept. of Biomedical Informatics, The Ohio State University, United States and Dept. of Electrical and Computer Engineering, The Ohio State University, United States;Center for Comprehensive Informatics, Emory University, United States;Dept. of Computer Science and Engineering, The Ohio State University, United States;Center for Comprehensive Informatics, Emory University, United States
Venue:
Parallel Computing
Year:
2011

Abstract

Scheduling in many application domains involves optimizing multiple performance metrics. For example, application workflows with real-time constraints must meet strict throughput requirements while also keeping latency (response time) low. In this paper, we present a novel algorithm for scheduling workflows that act on a stream of input data. Our algorithm focuses on two performance metrics, latency and throughput, and minimizes the latency of workflows while satisfying strict throughput requirements. We also describe how to use this approach to solve the problem of maximizing throughput while meeting latency requirements. We leverage pipelined, task, and data parallelism in a coordinated manner to meet these objectives, and we investigate the benefit of task duplication in alleviating communication overheads in the pipelined schedule for different workflow characteristics. The proposed algorithm is designed for a realistic bounded multi-port communication model, in which each processor can simultaneously communicate with at most k distinct processors. Experimental evaluation using synthetic benchmarks as well as benchmarks derived from real applications shows that our algorithm consistently produces lower-latency schedules that meet throughput requirements, even when previously proposed schemes fail.
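
As a rough illustration of the two metrics and the communication constraint named in the abstract, the sketch below evaluates a linear pipeline under a deliberately simplified cost model. It is not the paper's scheduling algorithm. The function names, the linear-chain workflow, the round-robin replication of stages, and the assumption that communication fully overlaps with computation are all illustrative assumptions that do not come from the source.

```python
from collections import defaultdict


def pipeline_metrics(stage_times, replicas, comm_times):
    """Latency and throughput of a linear pipeline (simplified, hypothetical model).

    stage_times[i]: time for stage i to process one data item
    replicas[i]:    number of copies of stage i, fed round-robin (assumption)
    comm_times[i]:  transfer time from stage i to stage i + 1
    """
    if len(replicas) != len(stage_times):
        raise ValueError("one replica count is needed per stage")
    if len(comm_times) != len(stage_times) - 1:
        raise ValueError("one transfer time is needed between adjacent stages")
    # A single item still visits every stage and every transfer exactly once,
    # so replication does not shorten its end-to-end latency in this model.
    latency = sum(stage_times) + sum(comm_times)
    # In steady state the slowest stage, after replication, sets the period.
    # Communication is assumed to overlap with computation (assumption).
    period = max(t / r for t, r in zip(stage_times, replicas))
    return latency, 1.0 / period


def respects_k_port(links, k):
    """Return True if every processor talks to at most k distinct peers."""
    peers = defaultdict(set)
    for a, b in links:
        peers[a].add(b)
        peers[b].add(a)
    return all(len(p) <= k for p in peers.values())


if __name__ == "__main__":
    # Replicating the slow middle stage three times raises throughput
    # while leaving the per-item latency unchanged.
    print(pipeline_metrics([2.0, 6.0, 1.0], [1, 1, 1], [0.5, 0.5]))
    print(pipeline_metrics([2.0, 6.0, 1.0], [1, 3, 1], [0.5, 0.5]))
    # Processor 0 feeding three replicas needs k >= 3 in this model.
    print(respects_k_port([(0, 1), (0, 2), (0, 3)], k=2))
```

Even in this toy model the tension described in the abstract is visible: replication improves throughput but widens the fan-out of a processor, which a small k in the bounded multi-port model may not allow.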