We address the problem of mapping divide-and-conquer programs to mesh-connected multicomputers with wormhole or store-and-forward routing. We propose the binomial tree as an efficient model of parallel divide-and-conquer and present two mappings of the binomial tree to the 2D mesh. Our mappings exploit regularity in the communication structure of the divide-and-conquer computation and are also sensitive to the underlying flow control scheme of the target architecture. We evaluate these mappings using new metrics that extend the classical notions of dilation and contention. We also introduce communication slowdown as a measure of the total communication overhead incurred by a parallel computation. We conclude that significant performance gains can be realized when the mapping is sensitive to the flow control scheme of the target architecture.
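To make the binomial tree model and the classical notion of dilation concrete, the following is a minimal sketch rather than either of the two mappings proposed in the paper. It assumes a common labelling in which, during phase j, every node i < 2^j that already holds a subproblem hands half of it to node i + 2^j, and it places the nodes on a square 2D mesh with a naive row-major layout. The function names (binomial_tree_edges, row_major, dilation) and the row-major placement are illustrative assumptions, not taken from the source.

```python
def binomial_tree_edges(k):
    """Return the (parent, child) edges of a binomial tree on 2**k nodes.

    Assumed labelling: in phase j, each node i < 2**j passes half of its
    subproblem to node i + 2**j.
    """
    return [(i, i + (1 << j)) for j in range(k) for i in range(1 << j)]


def row_major(node, cols):
    """Hypothetical placement: map a node label to (row, column) on the mesh."""
    return divmod(node, cols)


def dilation(edges, place):
    """Classical dilation: the longest mesh (Manhattan) distance spanned by
    any tree edge under the given placement."""
    def dist(a, b):
        (r1, c1), (r2, c2) = place(a), place(b)
        return abs(r1 - r2) + abs(c1 - c2)

    return max(dist(u, v) for u, v in edges)


k = 4                    # 16 tree nodes
cols = 1 << (k // 2)     # 4 x 4 mesh (k assumed even for a square mesh)
edges = binomial_tree_edges(k)
d = dilation(edges, lambda n: row_major(n, cols))
print(len(edges), d)
```

For k = 4 the sketch reports 15 tree edges and a dilation of 2 under the row-major placement. Dilation alone ignores when each edge is used and how the routers handle flow control, which is the gap the extended metrics in the abstract are meant to address.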