Efficient Implementation of High-Level Languages on User-Level Communications Architectures

Authors:
W. C. Hsieh;K. L. Johnson;M. F. Kaashoek;D. A. Wallach;W. E. Weihl
Venue:
Efficient Implementation of High-Level Languages on User-Level Communications Architectures
Year:
1994

Citing 0
Cited 2

Optimistic active messages: a mechanism for scheduling communication with computation
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming

Using active messages to support shared objects
EW 6 Proceedings of the 6th workshop on ACM SIGOPS European workshop: Matching operating systems to application needs

Abstract

User-level communication architectures (parallel architectures that give user code direct but protected access to the network) provide communication performance that is an order of magnitude higher than previous-generation message-passing architectures. Unfortunately, to take advantage of this level of performance, programmers must concern themselves with low-level issues that are often hardware-dependent (e.g., what primitives to use for large and small data transfers, and whether to use interrupts or polling). As a result, programs are difficult to design, implement, maintain, and port. New compiler and runtime system mechanisms are needed to allow programs written in high-level languages (languages where the programmer does not orchestrate communication) to achieve the full potential of user-level communication architectures. We propose a software architecture (compiler and runtime system) for implementing high-level languages with dynamic parallelism on user-level communication architectures. The compiler uses a simple runtime interface and a new strategy called optimistic active messages to eliminate overhead due to context switching and thread creation. The runtime supports user-level message handlers and multithreading. We developed an implementation of the runtime for the CM-5; microbenchmarks demonstrate that our runtime has excellent base performance. We compare our compilation strategy and runtime with a portable runtime that uses traditional network interfaces. On our system, the microbenchmarks perform up to 30 times better; three hand-compiled applications run 10% to 50% faster. We also compare our approach on these applications with hand-crafted C programs that use active messages; the microbenchmarks perform within 25% of C, and almost all of the applications perform within a factor of two of C.
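The abstract names optimistic active messages but does not spell out the mechanism. As a rough illustration only, the sketch below assumes the general idea is to run a message handler inline, as if it were a cheap active message, and to fall back to creating a thread only when the handler would have to block. The names `Runtime`, `deliver`, `acquire`, `WouldBlock`, and `increment` are hypothetical and are not taken from the paper or from the CM-5 runtime; Python threads stand in for the runtime's own threads.

```python
import threading


class WouldBlock(Exception):
    """Hypothetical signal: the handler cannot finish without blocking."""


class Runtime:
    """Toy runtime (an assumption, not the paper's interface)."""

    def __init__(self):
        self.inline_runs = 0        # handlers that finished inline
        self.thread_fallbacks = 0   # handlers restarted in a thread
        self._threads = []

    def deliver(self, handler, *args):
        # Optimistic path: run the handler directly, with no thread creation.
        try:
            handler(self, True, *args)
            self.inline_runs += 1
        except WouldBlock:
            # Pessimistic path: the handler aborted before doing any work,
            # so rerun it from the start in a thread that is allowed to block.
            self.thread_fallbacks += 1
            t = threading.Thread(target=handler, args=(self, False, *args))
            t.start()
            self._threads.append(t)

    def acquire(self, lock, optimistic):
        # Inline handlers may only try the lock; threads may wait for it.
        if optimistic:
            if not lock.acquire(blocking=False):
                raise WouldBlock()
        else:
            lock.acquire()

    def join_all(self):
        for t in self._threads:
            t.join()


counter = {"value": 0}
counter_lock = threading.Lock()


def increment(rt, optimistic, amount):
    # The lock is taken before any side effect, so an abort leaves no
    # partial state behind.
    rt.acquire(counter_lock, optimistic)
    try:
        counter["value"] += amount
    finally:
        counter_lock.release()


rt = Runtime()
rt.deliver(increment, 1)   # lock is free: handler completes inline
counter_lock.acquire()     # simulate contention on the shared object
rt.deliver(increment, 2)   # lock is held: handler aborts, thread is created
counter_lock.release()     # let the fallback thread proceed
rt.join_all()
```

In this sketch the common, uncontended case pays no thread-creation or context-switch cost, which is the kind of overhead the abstract says the strategy is meant to eliminate; the thread is only created on the rare path where blocking is unavoidable.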