On the Communication Complexity of Generalized 2-D Convolution on Array Processors

Authors:
Z. Fang; X. Li; L. M. Ni
Affiliations:
Concurrent Computer Corp.; Univ. of Alberta, Edmonton, Alta., Canada; Michigan State Univ., East Lansing
Venue:
IEEE Transactions on Computers
Year:
1989

References (7)

The cosmic cube. Communications of the ACM - Special section on computer architecture.
The connection machine.
Fundamentals of Logic Design.
Computer Architecture and Parallel Processing.
Computer Vision.
Special Computer Architectures for Pattern Processing.
Mathematical theory of multistage interconnection networks analysis.

Cited by (9)

A network-topology independent task allocation strategy for parallel computers. Proceedings of the 1990 ACM/IEEE conference on Supercomputing.
IPF for real-time image processing on massively parallel architectures. PACT '95 Proceedings of the IFIP WG10.3 working conference on Parallel architectures and compilation techniques.
Computer Vision Algorithms on Reconfigurable Logic Arrays. IEEE Transactions on Parallel and Distributed Systems.
Pipelined Data Parallel Algorithms-I: Concept and Modeling. IEEE Transactions on Parallel and Distributed Systems.
A Sliding Memory Plane Array Processor. IEEE Transactions on Parallel and Distributed Systems.
Implementation of a SliM Array Processor. IPPS '96 Proceedings of the 10th International Parallel Processing Symposium.
HPC-Colony: services and interfaces for very large systems. ACM SIGOPS Operating Systems Review.
Nearest neighbor classification on two types of SIMD machines. Parallel Computing.
Topology-aware task mapping for reducing communication contention on large parallel machines. IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing.

Quantified Score

Hi-index	14.98

Abstract

Several parallel convolution algorithms are presented for array processors with $N^2$ processing elements (PEs) connected by mesh, hypercube, and shuffle-exchange networks. The computation time complexity is the same for array processors with different interconnection networks. The communication time complexity, however, varies from network to network and is the main focus. It is shown that by using inter-PE communication networks efficiently, each PE requires only a small local memory, many unnecessary data transmissions are eliminated, and the overall time complexity of the algorithms (computation plus communication) is reduced to $O(M^2)$.
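
To make the mesh case concrete, the following is a minimal illustrative sketch and not the algorithm from the paper. It assumes an N x N mesh with wraparound links, one pixel per PE, an M x M kernel, and ordinary multiply-accumulate as the convolution operation. The names `shift`, `mesh_convolve`, and `direct_convolve` are hypothetical. The image plane is moved along a snake-shaped path through the kernel window, so every step is a single nearest-neighbor transfer and each PE keeps only one pixel value and one accumulator.

```python
def shift(plane, dr, dc):
    """One communication step: every PE receives the value held by the
    PE at offset (dr, dc). Wraparound links are an assumption."""
    n = len(plane)
    return [[plane[(i + dr) % n][(j + dc) % n] for j in range(n)]
            for i in range(n)]


def mesh_convolve(image, kernel):
    """Simulate out[i][j] = sum over a, b of kernel[a][b] * image[i+a][j+b]
    on an N x N mesh. Returns (result, number of unit shifts)."""
    n, m = len(image), len(kernel)
    acc = [[0] * n for _ in range(n)]      # one accumulator per PE
    plane = [row[:] for row in image]      # one pixel value per PE
    shifts = 0
    for a in range(m):
        # Snake order: left to right on even rows, right to left on odd rows.
        left_to_right = (a % 2 == 0)
        cols = range(m) if left_to_right else range(m - 1, -1, -1)
        for k, b in enumerate(cols):
            if k > 0:
                plane = shift(plane, 0, 1 if left_to_right else -1)
                shifts += 1
            # All PEs multiply-accumulate in parallel (simulated serially here).
            for i in range(n):
                for j in range(n):
                    acc[i][j] += kernel[a][b] * plane[i][j]
        if a < m - 1:
            plane = shift(plane, 1, 0)     # move to the next kernel row
            shifts += 1
    return acc, shifts


def direct_convolve(image, kernel):
    """Sequential reference with the same wraparound convention."""
    n, m = len(image), len(kernel)
    return [[sum(kernel[a][b] * image[(i + a) % n][(j + b) % n]
                 for a in range(m) for b in range(m))
             for j in range(n)] for i in range(n)]


image = [[4 * i + j for j in range(4)] for i in range(4)]
kernel = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
result, shifts = mesh_convolve(image, kernel)
print(result == direct_convolve(image, kernel), shifts)
```

In this sketch each PE performs M^2 multiply-accumulate operations and the plane is shifted M^2 - 1 times, so both computation and communication grow as O(M^2) regardless of N. How the hypercube and shuffle-exchange networks reach the same bound is not covered here.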