Mapping pipeline skeletons onto heterogeneous platforms

Authors:
Anne Benoit;Yves Robert
Affiliations:
Laboratoire LIP, UMR CNRS-INRIA-UCBL 5668, École Normale Supérieure de Lyon, 46 allée d'Italie, 69364 Lyon Cedex 07, France;Laboratoire LIP, UMR CNRS-INRIA-UCBL 5668, École Normale Supérieure de Lyon, 46 allée d'Italie, 69364 Lyon Cedex 07, France
Venue:
Journal of Parallel and Distributed Computing
Year:
2008


Abstract

Mapping applications onto parallel platforms is a challenging problem that becomes even more difficult when platforms are heterogeneous, which is nowadays a standard assumption. A high-level approach to parallel programming not only eases the application developer's task, but also provides additional information that can help realize an efficient mapping of the application. In this paper, we discuss the mapping of pipeline skeletons onto different types of platforms: Fully Homogeneous platforms, with identical processors and interconnection links; Communication Homogeneous platforms, with identical links but different-speed processors; and finally, Fully Heterogeneous platforms. We assume that a pipeline stage must be mapped onto a single processor, and we establish new theoretical complexity results for different mapping policies: a mapping can be required to be one-to-one (a processor is assigned at most one stage), interval-based (a processor is assigned an interval of consecutive stages), or fully general. In particular, we show that determining the optimal interval-based mapping is NP-hard for Communication Homogeneous platforms; this result establishes the complexity of the well-known chains-to-chains problem for different-speed processors. We provide several efficient polynomial heuristics for the most important policy/platform combination, namely interval-based mappings on Communication Homogeneous platforms. For small problem instances, these heuristics are compared to the optimal solution obtained by formulating the problem as an integer linear program.
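
To make the notion of an interval-based mapping on a Communication Homogeneous platform concrete, the following is a minimal sketch, not the authors' heuristics or their integer linear program. The abstract does not state a cost model, so the one used here is an assumption: each processor's cycle time is the time to receive its interval's input, compute all stages of the interval, and send the output, over links of common bandwidth b, and the period of the mapping is the largest cycle time. All names (w, delta, speeds, b, interval_period, best_interval_mapping) are hypothetical. The search is exhaustive and therefore exponential, so it is only usable on very small instances.

```python
from itertools import combinations, permutations


def interval_period(w, delta, speeds, b, cuts, procs):
    """Period of one interval mapping under the ASSUMED cost model.

    w[i]      : computation cost of stage i (stages 0..n-1)
    delta[i]  : size of the data entering stage i; delta[n] is the final output
    speeds[u] : speed of processor u (processors may differ)
    b         : bandwidth shared by all links (Communication Homogeneous)
    cuts      : sorted stage indices at which a new interval starts (0 excluded)
    procs     : processor assigned to each interval, all distinct
    """
    n = len(w)
    bounds = [0, *cuts, n]
    period = 0.0
    for k, u in enumerate(procs):
        d, e = bounds[k], bounds[k + 1]  # interval covers stages d..e-1
        # Assumed cycle time: receive input + compute + send output.
        cycle = delta[d] / b + sum(w[d:e]) / speeds[u] + delta[e] / b
        period = max(period, cycle)
    return period


def best_interval_mapping(w, delta, speeds, b):
    """Exhaustively search all interval mappings; return (period, cuts, procs)."""
    n, p = len(w), len(speeds)
    best = (float("inf"), None, None)
    for m in range(1, min(n, p) + 1):                 # number of intervals
        for cuts in combinations(range(1, n), m - 1):  # where intervals start
            for procs in permutations(range(p), m):    # distinct processors
                t = interval_period(w, delta, speeds, b, cuts, procs)
                if t < best[0]:
                    best = (t, cuts, procs)
    return best
```

For instance, with stage costs [4, 2, 6], unit data sizes [1, 1, 1, 1], processor speeds [2, 1] and b = 1, the search returns a period of 6.0, obtained by placing the first stage on the slower processor and the last two stages on the faster one. Under this assumed model, a one-to-one mapping corresponds to restricting the search to single-stage intervals.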