In a hypercube multiprocessor with distributed memory, each data element has a street address and an apartment number (i.e., a hypercube node address and a local memory address). We describe an optimal algorithm for performing all-to-some personalized communication (ASPC) on Boolean n-cubes, defined as (i|j) → (i ± 2^j | j), i ∈ [0, 2^n - 1], j ∈ [0, n - 1], where (i|j) denotes the data element on node i at location j. The algorithm also gives an optimal schedule for emulating PM2I networks on hypercubes under the binary-reflected Gray code encoding. We also study an important class of parallel algorithms, called ±2^b-descend, which perform log M iterations on an M-element input a[0 : M - 1]. For b = log M - 1, ..., 0, iteration b computes a new value of each a[i] as a function of a[i], a[i + 2^b], and a[i - 2^b]. For large applications, the problem size M is typically much larger than the number of nodes N. We show that on hypercubes, the optimal ASPC algorithm devised in this paper can be combined with pipelining of communication and computation in ±2^b-descend computations to reduce the number of communication steps from 2 · log N · M/N to 4 (log M + M/N - 1). In one communication step, a hypercube node can send n elements along its n links, one per link.
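The ideas in the abstract can be illustrated with a small sequential sketch. It is not the paper's parallel algorithm. It shows the binary-reflected Gray code embedding and the hop distance between the cube nodes that host PM2I neighbors i and i ± 2^j, a reference version of the ±2^b-descend iteration, and the two step counts. All function names are hypothetical. Index arithmetic is assumed to wrap around modulo 2^n (or modulo M), and the step counts are read as 2 · log N · (M/N) and 4 (log M + M/N - 1), with M and N assumed to be powers of two.

```python
from typing import Callable, List, Tuple


def gray(i: int) -> int:
    """Binary-reflected Gray code of i."""
    return i ^ (i >> 1)


def hop_distance(i: int, j: int, n: int) -> int:
    """Hamming distance between the cube nodes hosting i and i + 2^j (mod 2^n).

    Assumption: PM2I node i is placed on hypercube node gray(i).
    """
    target = (i + (1 << j)) % (1 << n)
    return bin(gray(i) ^ gray(target)).count("1")


def pm2_descend(a: List[int], f: Callable[[int, int, int], int]) -> List[int]:
    """Sequential reference for a +/-2^b-descend computation.

    For b = log M - 1, ..., 0, every a[i] is replaced by
    f(a[i], a[i + 2^b], a[i - 2^b]); indices wrap modulo M (an assumption).
    """
    m = len(a)
    log_m = m.bit_length() - 1
    if m != 1 << log_m:
        raise ValueError("input length must be a power of two")
    for b in range(log_m - 1, -1, -1):
        s = 1 << b
        # Build a new list so every update reads the previous iteration's values.
        a = [f(a[i], a[(i + s) % m], a[(i - s) % m]) for i in range(m)]
    return a


def comm_steps(m: int, n_nodes: int) -> Tuple[int, int]:
    """Return (unpipelined, pipelined) step counts under the reading stated above."""
    log_m = m.bit_length() - 1
    log_n = n_nodes.bit_length() - 1
    per_node = m // n_nodes
    return 2 * log_n * per_node, 4 * (log_m + per_node - 1)
```

Under this embedding, the nodes hosting i and i ± 2^j are one hop apart for j = 0 and two hops apart for j ≥ 1. For example, `comm_steps(2**20, 2**10)` evaluates the two expressions for M = 2^20 elements on N = 2^10 nodes.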