Models and complexity results for performance and energy optimization of concurrent streaming applications

Authors:
Anne Benoit;Paul Renaud-Goud;Yves Robert
Affiliations:
LIP, Ecole Normale Supérieure de Lyon, France.;LIP, Ecole Normale Supérieure de Lyon, France.;LIP, Ecole Normale Supérieure de Lyon, France.
Venue:
International Journal of High Performance Computing Applications
Year:
2011

Abstract

In this paper, we study the problem of finding optimal mappings for several independent but concurrent workflow applications, in order to optimize performance-related criteria together with energy consumption. Each application consists of a linear chain graph with several stages, and processes successive data sets in pipeline mode, from the first to the last stage. The problem is to decide which processors to enroll, at which speed (or mode) to use them, and which stages they should execute. There is a clear trade-off: running processors faster and/or enrolling more of them leads to better performance, but at the cost of higher energy consumption. Conversely, energy savings can be achieved at the price of lower performance, by reducing processor speeds or enrolling fewer resources. We study the problem complexity on different target execution platforms, ranging from fully homogeneous platforms to fully heterogeneous ones. We consider three mapping strategies: (i) one-to-one mappings, where a processor is assigned a single stage; (ii) interval mappings, where a processor may process an interval of consecutive stages of the same application; and (iii) general mappings, which are fully arbitrary, i.e., a processor may process stages of several distinct applications. Finally, we compare two different models for the computation of the latency, which is the time elapsed between the beginning and the end of the execution of a given data set: with the PATH model, it is computed as the length of the path taken by this data set, while with the WAVEFRONT model, each data set progresses concurrently within a period. For all platform types, all mapping strategies and both latency models, we establish the complexity of several multi-criteria optimization problems, whose objective functions combine period, latency and energy criteria. In particular, we exhibit instances where the problem is NP-hard with concurrent applications, while it can be solved in polynomial time for a single application, and instances whose problem complexity depends upon the latency model.
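
The trade-off between period, latency and energy can be illustrated with a minimal sketch for a single application under an interval mapping. The abstract does not state the cost formulas, so everything below is an assumption made for illustration: communication costs are ignored, the period is taken as the largest per-processor computation time, the latency as the sum of these times (a PATH-style reading), and the energy of a processor as its speed raised to a power `alpha` (a dynamic power model commonly used for voltage scaling). The names `Interval` and `evaluate` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Interval:
    """Consecutive stages [first, last] run by one processor at a given speed."""
    first: int
    last: int
    speed: float


def evaluate(work: List[float], mapping: List[Interval],
             alpha: float = 3.0) -> Tuple[float, float, float]:
    """Return (period, latency, energy) of an interval mapping.

    Assumptions (not taken from the paper): no communication cost,
    latency is the sum of per-interval times, and each enrolled
    processor consumes speed ** alpha units of energy.
    """
    # Time each processor spends on one data set.
    times = [sum(work[iv.first:iv.last + 1]) / iv.speed for iv in mapping]
    period = max(times)    # the slowest processor bounds the throughput
    latency = sum(times)   # a data set traverses every interval in turn
    energy = sum(iv.speed ** alpha for iv in mapping)
    return period, latency, energy


# Four stages with hypothetical computation requirements.
work = [2.0, 4.0, 6.0, 4.0]

one_proc = [Interval(0, 3, 2.0)]
two_procs = [Interval(0, 1, 2.0), Interval(2, 3, 2.0)]
one_to_one = [Interval(i, i, 2.0) for i in range(4)]

for name, mapping in [("one processor", one_proc),
                      ("two processors", two_procs),
                      ("one-to-one", one_to_one)]:
    print(name, evaluate(work, mapping))
```

Under these assumptions, enrolling more processors at the same speed shortens the period (8, then 5, then 3 time units) while the energy grows (8, then 16, then 32 units), which is the tension the multi-criteria problems formalize.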