Efficient Implementation of High-Level Languages on User-Level Communications Architectures

  • Authors:
  • W. C. Hsieh;K. L. Johnson;M. F. Kaashoek;D. A. Wallach;W. E. Weihl

  • Venue:
  • Efficient Implementation of High-Level Languages on User-Level Communications Architectures
  • Year:
  • 1994

Abstract

User-level communication architectures (parallel architectures that give user code direct but protected access to the network) provide communication performance that is an order of magnitude higher than previous-generation message-passing architectures. Unfortunately, in order to take advantage of this level of performance, programmers must concern themselves with low-level issues that are often hardware-dependent (e.g., what primitives to use for large and small data transfers, and whether to use interrupts or polling). As a result, programs are difficult to design, implement, maintain, and port. New compiler and runtime system mechanisms are needed to allow programs written in high-level languages (languages where the programmer does not orchestrate communication) to achieve the full potential of user-level communication architectures. We propose a software architecture (compiler and runtime system) for implementing high-level languages with dynamic parallelism on user-level communication architectures. The compiler uses a simple runtime interface, and a new strategy called optimistic active messages to eliminate overhead due to context switching and thread creation. The runtime supports user-level message handlers and multithreading. We developed an implementation of the runtime for the CM-5; microbenchmarks demonstrate that our runtime has excellent base performance. We compare our compilation strategy and runtime with a portable runtime that uses traditional network interfaces. On our system, the microbenchmarks perform up to 30 times better; three hand-compiled applications run 10% to 50% faster. We also compare our approach on these applications with hand-crafted C programs that use active messages; the microbenchmarks perform within 25% of C, and almost all of the applications perform within a factor of two of C.