Achieving Robustness and Minimizing Overhead in Parallel Algorithms Through Overlapped Communication/Computation

Authors:
Arun K. Somani;Allen M. Sansano
Affiliations:
Electrical and Computer Engineering, Iowa State University, Ames, IA 50011-3060 arun@iastate.edu;C-Cube Microsystems, 1778 McCarthy Blvd, Milpitas, CA 95035 asansano@c-cube.com
Venue:
The Journal of Supercomputing - Special issue on embedded fault-tolerance systems
Year:
2000

Citing 14
Cited 0

High-performance computer architecture

High-performance computer architecture
Algorithms for matrix transposition on Boolean N-cube configured ensemble architecture

SIAM Journal on Matrix Analysis and Applications
Optimum Broadcasting and Personalized Communication in Hypercubes

IEEE Transactions on Computers
Optimal communication algorithms for hypercubes

Journal of Parallel and Distributed Computing
The Stanford Dash Multiprocessor

Computer
Rearrangeable circuit-switched hypercube architectures for routing permutations

Journal of Parallel and Distributed Computing
The Stanford FLASH multiprocessor

ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Integration of message passing and shared memory in the Stanford FLASH multiprocessor

ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
On characterizing bandwidth requirements of parallel applications

Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Numerical Recipes in C: The Art of Scientific Computing

Numerical Recipes in C: The Art of Scientific Computing
Multiphase Complete Exchange on Paragon, SP2, and CS-2

IEEE Parallel & Distributed Technology: Systems & Technology
Synchronizing Hypercube Networks in the Presence of Faults

IEEE Transactions on Computers
Latency Hiding in Message-Passing Architectures

Proceedings of the 8th International Symposium on Parallel Processing
Cache write generate for parallel image processing on shared memory architectures

IEEE Transactions on Image Processing

Quantified Score

Hi-index	0.00

Visualization

Abstract

One of the major goals in the design of parallel processing machines and algorithms is to achieve robustness and reduce the effects of the overhead introduced when a given problem is parallelized or a fault occurs. A key contributor to overhead is communication time, in particular when a node is faulty and another node is substuiting for its operation. Many architectures try to reduce this overhead by minimizing the actual time for a communication to occur, including latency and bandwidth figures. Another approach is to hide communication by overlapping it with computation assuming that the computation is the most prominent factor. This paper presents the mechanisms provided in the Proteus parallel computer and its effective use of communication hiding through overlapping communication/computation techniques with and without the presence of a fault. These techniques are easily extended for use in compiler support of parallel programming. We also address the complexity (or rather simplicity) in achieving complete exchange on the Proteus Machine.