Low-latency message communication support for the AP1000

Authors:
Toshiyuki Shimizu;Takeshi Horie;Hiroaki Ishihata
Venue:
ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
Year:
1992

Abstract

Low-latency communication is the key to achieving a high-performance parallel computer. With state-of-the-art processors, cache memory must also be taken into account. This paper presents an architecture for low-latency message communication, its implementation, and a performance evaluation. We developed a message controller (MSC) that supports low-latency message-passing communication on the AP1000 and minimizes message-handling overhead. The MSC sends messages directly from cache memory and automatically receives messages into a circular buffer. We designed communication functions between cells and evaluated communication performance by running benchmark programs such as the Pingpong benchmark, the LINPACK benchmark, the SLALOM benchmark, and a solver using the scaled conjugate gradient method.
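
The abstract does not describe how the receive-side circular buffer is laid out. As a purely illustrative software model, not the MSC hardware design, the general idea can be sketched as below. The class name `RingBuffer`, the methods `deliver` and `receive`, and the policy of refusing a message when the buffer is full are all assumptions made for this sketch.

```python
class RingBuffer:
    """Fixed-capacity circular buffer (hypothetical model, not the MSC itself)."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots = [None] * capacity
        self._head = 0   # index of the oldest unread message
        self._count = 0  # number of unread messages currently stored

    def deliver(self, message):
        """Store an arriving message; return False if the buffer is full.

        Refusing the message when full is an assumption of this sketch.
        """
        if self._count == len(self._slots):
            return False
        tail = (self._head + self._count) % len(self._slots)
        self._slots[tail] = message
        self._count += 1
        return True

    def receive(self):
        """Return the oldest unread message, or None if the buffer is empty."""
        if self._count == 0:
            return None
        message = self._slots[self._head]
        self._slots[self._head] = None
        self._head = (self._head + 1) % len(self._slots)
        self._count -= 1
        return message
```

In this model the receiver writes at the tail and the program reads from the head, so the indices wrap around a fixed region of memory and no per-message buffer has to be allocated.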