IEEE Transactions on Parallel and Distributed Systems
Multiprocessor mapping and scheduling algorithms have been studied extensively over the past few decades and from many perspectives. In the late 1980s, the two-step decomposition of scheduling into clustering and cluster-scheduling was introduced. Since then, several clustering and merging algorithms have been proposed and individually reported to be efficient. However, it is not clear how effective they are or how well they compare against single-step scheduling algorithms or other multistep algorithms. In this paper, we explore the effectiveness of the two-phase decomposition of scheduling and describe efficient and novel techniques that aggressively streamline interprocessor communication and can be tuned to exploit the significantly longer compilation time that is available to embedded system designers. We evaluate a number of leading clustering and merging algorithms using a set of benchmarks with diverse structures, and we present an experimental setup for comparing the single-step against the two-step scheduling approach. We determine the importance of the individual steps in scheduling and their effect on overall schedule performance, and we show that decomposing the scheduling process indeed improves overall performance. We also show that the quality of the solutions depends on the quality of the clusters generated in the clustering step. Based on these results, we discuss why the parallel time metric in the clustering step may not provide an accurate measure of the final performance of cluster-scheduling.
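To make the two-step decomposition concrete, the following is a minimal Python sketch, not the algorithms evaluated in the paper. It assumes a toy task graph and uses two deliberately simple stand-ins: a greedy edge-zeroing pass for the clustering step (clusters are formed on an unbounded number of processors, accepting a merge only if the parallel time does not grow) and a largest-cluster-first load-balancing pass for the merging step. The function names (`topo_order`, `makespan`, `cluster`, `merge`), the graph, and all weights are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def topo_order(tasks, edges):
    """Return the tasks in a topological order of the task graph."""
    indeg = {t: 0 for t in tasks}
    succ = defaultdict(list)
    for (u, v) in edges:
        indeg[v] += 1
        succ[u].append(v)
    ready = sorted(t for t in tasks if indeg[t] == 0)
    order = []
    while ready:
        u = ready.pop(0)
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return order

def makespan(tasks, edges, place):
    """Schedule length when task t runs on resource place[t].

    Simplifying assumptions: tasks sharing a resource run one after another
    in topological order, and an edge costs its communication time only when
    its endpoints sit on different resources.
    """
    pred = defaultdict(list)
    for (u, v), cost in edges.items():
        pred[v].append((u, cost))
    finish, free_at = {}, defaultdict(int)
    for t in topo_order(tasks, edges):
        data_ready = max(
            (finish[u] + (0 if place[u] == place[t] else c) for u, c in pred[t]),
            default=0,
        )
        start = max(data_ready, free_at[place[t]])
        finish[t] = start + tasks[t]
        free_at[place[t]] = finish[t]
    return max(finish.values())

def cluster(tasks, edges):
    """Step 1 (illustrative): greedy edge zeroing on unbounded processors."""
    place = {t: t for t in tasks}  # every task starts in its own cluster
    best = makespan(tasks, edges, place)
    for (u, v), _ in sorted(edges.items(), key=lambda e: -e[1]):
        if place[u] == place[v]:
            continue
        # Tentatively fold v's cluster into u's cluster.
        trial = {t: (place[u] if p == place[v] else p) for t, p in place.items()}
        parallel_time = makespan(tasks, edges, trial)
        if parallel_time <= best:  # keep the merge only if it does not hurt
            place, best = trial, parallel_time
    return place, best

def merge(tasks, place, num_procs):
    """Step 2 (illustrative): map clusters to processors, heaviest first."""
    load = defaultdict(int)
    for t, c in place.items():
        load[c] += tasks[t]
    proc_load = [0] * num_procs
    cluster_to_proc = {}
    for c in sorted(load, key=lambda c: -load[c]):
        p = proc_load.index(min(proc_load))  # least-loaded processor so far
        cluster_to_proc[c] = p
        proc_load[p] += load[c]
    return {t: cluster_to_proc[c] for t, c in place.items()}

# Hypothetical fork-join graph: task -> execution time, edge -> communication cost.
tasks = {"a": 2, "b": 3, "c": 3, "d": 2}
edges = {("a", "b"): 4, ("a", "c"): 1, ("b", "d"): 4, ("c", "d"): 1}

clusters, parallel_time = cluster(tasks, edges)
two_proc = makespan(tasks, edges, merge(tasks, clusters, num_procs=2))
one_proc = makespan(tasks, edges, merge(tasks, clusters, num_procs=1))
print(parallel_time, two_proc, one_proc)
```

In this toy instance the parallel time reported after clustering matches the final schedule length on two processors but not on one, a small illustration of why the clustering-step metric alone need not predict the performance obtained after cluster-scheduling.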