Experiences with non-numeric applications on multithreaded architectures
PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
Hierarchical fuzzy configuration of implementation strategies
Proceedings of the 1999 ACM symposium on Applied computing
Asynchrony in parallel computing: from dataflow to multithreading
Progress in computer research
Compiling Several Classes of Communication Patterns on a Multithreaded Architecture
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Data locality sensitivity of multithreaded computations on a distributed-memory multiprocessor
CASCON '96 Proceedings of the 1996 conference of the Centre for Advanced Studies on Collaborative research
High-Level Data Parallel Programming in PROMOTER
HIPS '97 Proceedings of the 1997 Workshop on High-Level Programming Models and Supportive Environments (HIPS '97)
Dynamic load balancing efficiently in a large-scale cluster
International Journal of High Performance Computing and Networking
An efficient dynamic load-balancing algorithm in a large-scale cluster
ICA3PP'05 Proceedings of the 6th international conference on Algorithms and Architectures for Parallel Processing
Abstract: The sustained performance of superscalar microprocessors amounts to only a fraction of their peak performance rating. In parallel computers built from them, this discrepancy is even more dramatic. Reaching a satisfactory sustained performance for the single processor is mainly a compiler problem; the sustained performance of parallel computers depends also on other components of the architecture, such as the interconnect and the operating system. It is shown how, through a combination of innovative architectural solutions, the sustained performance of a distributed-memory parallel computer can be significantly improved. The key to effective latency hiding by overlapping communication and computation is the operating system. The programmability of such architectures can be enhanced by providing the programmer with parallelizing compilers and/or a global address space realized through virtual shared memory. All these measures have been incorporated in the MANNA computer described in the paper, and benchmark performance figures obtained with it are reported.
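The abstract's central technique, latency hiding by overlapping communication and computation, can be illustrated independently of the MANNA hardware. The following is a minimal sketch in Python, not the paper's implementation: a background thread prefetches the next "remote" block (simulated here by fetch_remote_block with an artificial delay) while the main thread computes on the current one, so communication latency is hidden behind useful work. All names and timings are illustrative assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_remote_block(block_id):
    # Stand-in for a remote-memory fetch; the sleep models network latency.
    time.sleep(0.05)
    return [block_id * 10 + i for i in range(4)]

def compute(block):
    # Stand-in for the local computation performed on each fetched block.
    return sum(x * x for x in block)

def overlapped(n_blocks):
    """Process n_blocks, prefetching block b+1 while computing on block b."""
    results = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fetch_remote_block, 0)   # start the first fetch
        for b in range(n_blocks):
            block = future.result()                   # wait for current block
            if b + 1 < n_blocks:
                # Issue the next fetch before computing: the communication
                # for block b+1 overlaps the computation on block b.
                future = pool.submit(fetch_remote_block, b + 1)
            results.append(compute(block))
    return results
```

With perfect overlap, the total run time approaches max(total fetch time, total compute time) instead of their sum, which is the effect the paper attributes to operating-system support on MANNA.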