Optimized on-chip pipelining of memory-intensive computations on the cell BE
ACM SIGARCH Computer Architecture News
The problem of partitioning task graphs is, in its general form, known to be NP-complete, and even simple heuristics that are both effective and fast are extremely difficult to devise. This paper considers tree task graphs, which arise from many important programming paradigms such as divide and conquer, branch and bound, etc. The target architecture is a shared memory architecture, as it typifies a wide range of multiprocessor research prototypes and commercial products. Optimal sequential and parallel algorithms are developed for partitioning tree task graphs such that the bottleneck is minimized with the minimum number of processors. The bandwidth minimization problem is NP-complete even for trees. Three simple, effective 2-pass heuristics and their parallel versions are given. The effectiveness and efficiency of these heuristics are validated through extensive simulations.
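To make the partitioning objective concrete, the following is a minimal sketch of one well-known way to partition a tree under a load bound; it is not the algorithm developed in the paper. It assumes that each task has a positive integer weight, that the bottleneck is the total weight of the heaviest connected component, and that communication costs are ignored. The function names `min_components` and `min_bottleneck` are hypothetical. For a fixed bound, a bottom-up greedy pass cuts off the heaviest child subtrees whenever a component would exceed the bound; a binary search over the bound then finds the smallest bottleneck achievable with a given number of processors.

```python
def min_components(children, weight, root, bound):
    """Fewest connected components of weight <= bound, or None if infeasible.

    Assumption: the load of a component is the sum of its task weights.
    """
    if max(weight) > bound:
        return None  # a single task already exceeds the bound

    # Preorder traversal; its reverse visits every child before its parent.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])

    residual = list(weight)  # weight still attached to each subtree root
    cuts = 0
    for v in reversed(order):
        kids = sorted((residual[c] for c in children[v]), reverse=True)
        total = weight[v] + sum(kids)
        for w in kids:  # detach the heaviest children first
            if total <= bound:
                break
            total -= w
            cuts += 1
        residual[v] = total
    return cuts + 1


def min_bottleneck(children, weight, root, processors):
    """Smallest bound for which at most `processors` components suffice."""
    lo, hi = max(weight), sum(weight)
    while lo < hi:
        mid = (lo + hi) // 2
        if min_components(children, weight, root, mid) <= processors:
            hi = mid
        else:
            lo = mid + 1
    return lo


if __name__ == "__main__":
    # Hypothetical 5-task tree: task 0 is the root, tasks 3 and 4 hang off task 1.
    children = [[1, 2], [3, 4], [], [], []]
    weight = [2, 3, 4, 1, 5]
    for p in (1, 2, 3):
        print(p, min_bottleneck(children, weight, 0, p))
```

The binary search relies on the fact that raising the bound can never increase the number of components required, so the feasibility test is monotone in the bound.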