In this paper, we study the problem of finding optimal mappings for several independent but concurrent workflow applications, in order to optimize performance-related criteria together with energy consumption. Each application consists of a linear chain graph with several stages, and processes successive data sets in pipeline mode, from the first to the last stage. The problem is to decide which processors to enroll, at which speed (or mode) to use them, and which stages they should execute. There is a clear trade-off: running faster processors and/or more of them leads to better performance, but at the cost of very high energy consumption. Conversely, energy savings can be achieved at the price of lower performance, by reducing processor speeds or enrolling fewer resources. We study the problem complexity on different target execution platforms, ranging from fully homogeneous platforms to fully heterogeneous ones. We consider three mapping strategies: (i) one-to-one mappings, where a processor is assigned a single stage; (ii) interval mappings, where a processor may process an interval of consecutive stages of the same application; and (iii) general mappings, which are fully arbitrary, i.e. a processor may process stages of several distinct applications. Finally, we compare two different models for the computation of the latency, which is the time elapsed between the beginning and the end of the execution of a given data set: with the PATH model, it is computed as the length of the path taken by this data set, while with the WAVEFRONT model, each data set progresses concurrently within a period. For all platform types, all mapping strategies and both latency models, we establish the complexity of several multi-criteria optimization problems, whose objective functions combine period, latency and energy criteria.
In particular, we exhibit instances where the problem is NP-hard with concurrent applications, while it can be solved in polynomial time for a single application, and instances whose problem complexity depends upon the latency model.
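
To make the three criteria concrete, the following is a minimal sketch that evaluates one interval mapping of a single application. It rests on simplifying assumptions that are not taken from the abstract: communication costs are ignored, the period is the largest per-processor cycle time, the PATH latency is the sum of the interval times, and each enrolled processor consumes an energy of speed ** alpha (a common dynamic-power approximation, with alpha = 3 chosen here for illustration). The names Interval and evaluate_interval_mapping are hypothetical, and the WAVEFRONT model is not covered.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """Consecutive stages first..last (inclusive) run on one processor."""
    first: int
    last: int
    speed: float  # assumed processor speed (work units per time unit)


def evaluate_interval_mapping(work, intervals, alpha=3.0):
    """Return (period, path_latency, energy) for one linear-chain application.

    Assumptions (illustrative only): no communication costs, and an
    energy of speed ** alpha per enrolled processor.
    """
    times = []
    expected_first = 0
    for iv in intervals:
        # The intervals must partition the chain into consecutive blocks.
        if iv.first != expected_first or iv.last < iv.first:
            raise ValueError("intervals must cover the stages consecutively")
        if iv.speed <= 0:
            raise ValueError("processor speed must be positive")
        times.append(sum(work[iv.first:iv.last + 1]) / iv.speed)
        expected_first = iv.last + 1
    if expected_first != len(work) or not times:
        raise ValueError("intervals must cover every stage")

    period = max(times)        # the slowest processor bounds the throughput
    path_latency = sum(times)  # a data set traverses every interval in turn
    energy = sum(iv.speed ** alpha for iv in intervals)
    return period, path_latency, energy


if __name__ == "__main__":
    work = [4.0, 2.0, 6.0]
    fast = [Interval(0, 1, 2.0), Interval(2, 2, 3.0)]
    slow = [Interval(0, 1, 1.0), Interval(2, 2, 2.0)]
    print(evaluate_interval_mapping(work, fast))
    print(evaluate_interval_mapping(work, slow))
```

Under these assumptions, lowering the two speeds in the example increases both the period and the latency while reducing the energy, which is the trade-off the abstract describes.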