Improving AP1000 parallel computer performance with message communication

Authors:
Takeshi Horie;Kenichi Hayashi;Toshiyuki Shimizu;Hiroaki Ishihata
Venue:
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
Year:
1993

Abstract

The performance of message-passing applications depends on CPU speed, communication throughput and latency, and message handling overhead. In this paper we investigate how varying these parameters, and applying techniques that reduce message handling overhead, affects the execution efficiency of ten different applications. Using a message-level simulator configured for the AP1000 architecture, we show that improving communication performance, especially message handling, improves total performance. With a CPU that is 32 times faster, total performance increases by less than a factor of ten unless message handling overhead is also reduced. Overlapping computation with message reception improves performance significantly. We also discuss how to improve the AP1000 architecture.
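The claim that a 32 times faster CPU yields less than a tenfold gain can be illustrated with a simple Amdahl-style model. This is a minimal sketch and not the paper's message-level simulator: it assumes a hypothetical share of baseline run time (`fixed_fraction`) that does not shrink with CPU speed, standing in for message handling and communication costs. The function names and the example fractions are illustrative assumptions, not figures from the paper.

```python
def speedup(cpu_factor, fixed_fraction, overhead_reduction=1.0):
    """Amdahl-style speedup over the baseline run time (normalized to 1).

    cpu_factor         -- how many times faster the CPU is (assumed input)
    fixed_fraction     -- hypothetical share of baseline time that does not
                          scale with CPU speed (e.g. message handling)
    overhead_reduction -- factor by which that fixed share is reduced
    """
    scaled_part = (1.0 - fixed_fraction) / cpu_factor
    fixed_part = fixed_fraction / overhead_reduction
    return 1.0 / (scaled_part + fixed_part)


def min_fixed_fraction(cpu_factor, target_speedup):
    """Fixed share at which speedup equals target_speedup.

    Solves 1 / ((1 - f) / s + f) = target for f; any larger share
    keeps the speedup below the target.
    """
    return (cpu_factor / target_speedup - 1.0) / (cpu_factor - 1.0)


if __name__ == "__main__":
    # Hypothetical example: 10% of baseline time is non-scaling overhead.
    print(speedup(32, 0.10))                        # stays below 10
    print(speedup(32, 0.10, overhead_reduction=4))  # rises above 10
    print(min_fixed_fraction(32, 10))               # threshold share
```

Under this toy model, a non-scaling share of only about 7% of baseline time is enough to hold the speedup from a 32 times faster CPU below ten, which is consistent in spirit with the abstract's observation that message handling overhead must be reduced alongside CPU speed.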