Evaluation of architectural support for global address-based communication in large-scale parallel machines

Authors:
Arvind Krishnamurthy;Klaus E. Schauser;Chris J. Scheiman;Randolph Y. Wang;David E. Culler;Katherine Yelick
Affiliations:
Computer Science Division, University of California, Berkeley, CA;Department of Computer Science, University of California, Santa Barbara, CA;Department of Computer Science, University of California, Santa Barbara, CA;Computer Science Division, University of California, Berkeley, CA;Computer Science Division, University of California, Berkeley, CA;Computer Science Division, University of California, Berkeley, CA
Venue:
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Year:
1996

Citing 25
Cited 4

Memory coherence in shared virtual memory systems

ACM Transactions on Computer Systems (TOCS)
Implementation and performance of Munin

SOSP '91 Proceedings of the thirteenth ACM symposium on Operating systems principles
Alpha architecture reference manual

Alpha architecture reference manual
Active messages: a mechanism for integrated communication and computation

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
The network architecture of the Connection Machine CM-5 (extended abstract)

SPAA '92 Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures
LogP: towards a realistic model of parallel computation

PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Anatomy of a message in the Alewife multiprocessor

ICS '93 Proceedings of the 7th international conference on Supercomputing
Efficient software-based fault isolation

SOSP '93 Proceedings of the fourteenth ACM symposium on Operating systems principles
Parallel programming in Split-C

Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Message passing on the Meiko CS-2

Parallel Computing - Special issue: message passing interfaces
The Stanford FLASH multiprocessor

ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
Tempest and typhoon: user-level shared memory

ISCA '94 Proceedings of the 21st annual international symposium on Computer architecture
U-Net: a user-level network interface for parallel and distributed computing

SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Extensibility safety and performance in the SPIN operating system

SOSP '95 Proceedings of the fifteenth ACM symposium on Operating systems principles
Empirical evaluation of the CRAY-T3D: a compiler perspective

ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
TreadMarks: Shared Memory Computing on Networks of Workstations

Computer
The directory-based cache coherence protocol for the DASH multiprocessor

ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Myrinet: A Gigabit-per-Second Local Area Network

IEEE Micro
A Case for NOW (Networks of Workstations)

IEEE Micro
Assessing Fast Network Interfaces

IEEE Micro
Exploiting the Capabilities of Communications Co-Processors

IPPS '96 Proceedings of the 10th International Parallel Processing Symposium
Experience with active messages on the Meiko CS-2

IPPS '95 Proceedings of the 9th International Symposium on Parallel Processing
Compositional C++: Compositional Parallel Programming

Proceedings of the 5th International Workshop on Languages and Compilers for Parallel Computing
Early Experiences with Olden

Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing
Cid: A Parallel, "Shared-Memory" C for Distributed-Memory Machines

LCPC '94 Proceedings of the 7th International Workshop on Languages and Compilers for Parallel Computing

pSNOW: a tool to evaluate architectural issues for NOW environments

ICS '97 Proceedings of the 11th international conference on Supercomputing
Evaluating design alternatives for reliable communication on high-speed networks

ACM SIGPLAN Notices
Evaluating design alternatives for reliable communication on high-speed networks

ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Cluster communication protocols for parallel-programming systems

ACM Transactions on Computer Systems (TOCS)

Quantified Score

Hi-index	0.00

Visualization

Abstract

Large-scale parallel machines are incorporating increasingly sophisticated architectural support for user-level messaging and global memory access. We provide a systematic evaluation of a broad spectrum of current design alternatives based on our implementations of a global address language on the Thinking Machines CM-5, Intel Paragon, Meiko CS-2, Cray T3D, and Berkeley NOW. This evaluation includes a range of compilation strategies that make varying use of the network processor; each is optimized for the target architecture and the particular strategy. We analyze a family of interacting issues that determine the performance trade-offs in each implementation, quantify the resulting latency, overhead, and bandwidth of the global access operations, and demonstrate the effects on application performance.