In this paper, we consider the problem of scheduling streaming applications described by complex task graphs on a heterogeneous multicore platform, the IBM QS22, which embeds two STI Cell Broadband Engine processors. We first derive a complete computation and communication model of the platform from comprehensive benchmarks. We then use this model to formulate the problem of maximizing the throughput of a streaming application on this platform. Although the problem is proven NP-complete, we present an optimal solution based on mixed linear programming. We also propose simpler scheduling heuristics to compute a mapping of the application task graph onto the platform. Returning to the platform, we propose scheduling software to deploy streaming applications on it, which allows us to thoroughly test our scheduling strategies on real hardware. We show that we achieve a good speed-up both with the mixed linear programming solution and with involved scheduling heuristics. Copyright © 2011 John Wiley & Sons, Ltd.