A 'recursively decomposable' network G can be partitioned into a fixed number of subnetworks, each of which is itself recursively decomposable and 'a smaller version' of G. Several notions of such networks emerge, depending on which parameters are chosen to model one subnetwork as 'a smaller version' of another. Examples of such parameters are permutation time, bandwidth, latency, topology, wires, degree, and size. This paper introduces and studies the class of networks that are recursively decomposable relative to bandwidth-inefficiency limitations, and the subclass of 'recurrent networks' that are recursively decomposable relative to topology limitations. We prove lower bounds on the number of wires in a recursively decomposable network with a given bandwidth-inefficiency function. We show that these bounds are tight by exhibiting recurrent networks that meet them. Linear arrays, hypercubes, and completely connected networks are shown to be exactly optimal among networks with their respective bandwidth-inefficiency functions. We generalize our results to processor networks such as trees and butterflies, which are 'almost' recursively decomposable in that only a subset of 'core' processors survives the decomposition process. The core of a tree consists of its leaves. We derive tradeoffs between degree, bandwidth-inefficiency, and core-inefficiency. N-processor butterfly networks are shown to have an essentially optimal core for fixed-degree networks with bandwidth-inefficiency function Θ(log N).
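
As an informal illustration of decomposition relative to topology, the sketch below splits a d-dimensional hypercube on its top address bit and checks that each half is a (d-1)-dimensional hypercube. It is not the paper's formal definition or construction: the function names `hypercube_edges` and `split_by_top_bit` are hypothetical, and the choice of two subnetworks is an assumption made only for this example.

```python
def hypercube_edges(d):
    """Edges (u, v) with u < v of the d-dimensional hypercube on nodes 0 .. 2**d - 1."""
    edges = set()
    for u in range(2 ** d):
        for b in range(d):
            v = u ^ (1 << b)          # flip one address bit to reach a neighbour
            if u < v:                 # store each undirected edge once
                edges.add((u, v))
    return edges


def split_by_top_bit(d):
    """Partition the d-cube into two halves by the top address bit.

    Returns (low, high, cut): the edges inside each half, with nodes
    relabelled to 0 .. 2**(d-1) - 1, and the edges crossing the partition.
    """
    top = 1 << (d - 1)
    low, high, cut = set(), set(), set()
    for u, v in hypercube_edges(d):
        if (u & top) != (v & top):
            cut.add((u, v))                 # edge joins the two halves
        elif u & top:
            high.add((u ^ top, v ^ top))    # clear the top bit to relabel
        else:
            low.add((u, v))
    return low, high, cut


if __name__ == "__main__":
    low, high, cut = split_by_top_bit(4)
    smaller = hypercube_edges(3)
    print(low == smaller, high == smaller, len(cut))
```

Each half reproduces the edge set of the smaller hypercube exactly, and the edges removed by the split are the 2^(d-1) edges along the top dimension.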