Processing Element Design for a Parallel Computer

Authors:
Katsuyuki Kaneko; Masaitsu Nakajima; Yasuhiro Nakakura; Junji Nishikawa; Ichiro Okabayashi; Hiroshi Kadota
Venue:
IEEE Micro
Year:
1990

Abstract

A study has been made of how the cost-effectiveness brought by improvements in VLSI technology can be applied to a scientific computer system without performance loss. The result is a parallel computer, ADENA (Alternating Direction Edition Nexus Array), whose core consists of four kinds of VLSI chips: two for the processor elements (PEs) and two for the interprocessor network (plus some memory chips). An overview of ADENA and an analysis of its performance are given. The design considerations for the PEs incorporated in ADENA are discussed. The factors that limit performance in a parallel processing environment are analyzed, and the measures employed at the LSI design level to address these factors are described. The 42.6 sq cm CMOS PEs reach a peak performance of 20 MFLOPS. A 256-PE ADENA has achieved 1.5 GFLOPS, and 300 to 400 MFLOPS for PDE applications.