Optimistic active messages: a mechanism for scheduling communication with computation

Authors:
Deborah A. Wallach;Wilson C. Hsieh;Kirk L. Johnson;M. Frans Kaashoek;William E. Weihl
Affiliations:
M.I.T. Laboratory for Computer Science, Cambridge MA;M.I.T. Laboratory for Computer Science, Cambridge MA;M.I.T. Laboratory for Computer Science, Cambridge MA;M.I.T. Laboratory for Computer Science, Cambridge MA;M.I.T. Laboratory for Computer Science, Cambridge MA
Venue:
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Year:
1995

Citing 22
Cited 24

Experience with CST: programming and implementation

PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Fine-grain parallelism with minimal hardware support: a compiler-controlled threaded abstract machine

ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Orca: A Language for Parallel Programming of Distributed Systems

IEEE Transactions on Software Engineering
SPLASH: Stanford parallel applications for shared-memory

ACM SIGARCH Computer Architecture News
Active messages: a mechanism for integrated communication and computation

ISCA '92 Proceedings of the 19th annual international symposium on Computer architecture
The network architecture of the Connection Machine CM-5 (extended abstract)

SPAA '92 Proceedings of the fourth annual ACM symposium on Parallel algorithms and architectures
Integrating message-passing and shared-memory: early experience

PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Parallel programming in Split-C

Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Concert-efficient runtime support for concurrent object-oriented programming languages on stock hardware

Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Separating data and control transfer in distributed operating systems

ASPLOS VI Proceedings of the sixth international conference on Architectural support for programming languages and operating systems
Cilk: an efficient multithreaded runtime system

PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
The MIT Alewife machine: architecture and performance

ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
The Tera computer system

ICS '90 Proceedings of the 4th international conference on Supercomputing
Optimistic active messages: structuring systems for high-performance communication

EW 6 Proceedings of the 6th workshop on ACM SIGOPS European workshop: Matching operating systems to application needs
Using active messages to support shared objects

EW 6 Proceedings of the 6th workshop on ACM SIGOPS European workshop: Matching operating systems to application needs
The Message-Driven Processor: A Multicomputer Processing Node with Efficient Mechanisms

IEEE Micro
Lazy Task Creation: A Technique for Increasing the Granularity of Parallel Programs

IEEE Transactions on Parallel and Distributed Systems
How to Get Good Performance from the CM-5 Data Network

Proceedings of the 8th International Symposium on Parallel Processing
PRELUDE: A SYSTEM FOR PORTABLE PARALL
Efficient Implementation of High-Level Languages on User-Level Communications Architectures
A Concurrent Smalltalk Compiler for the Message-Driven Processor
Concurrent Aggregates (CA): an Object-Oriented Language for Fine-Grained Message-Passing Machines

Evaluating the locality benefits of active messages

PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Remote queues: exposing message queues for optimization and atomicity

Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures
Distributed Shared Abstractions (DSA) on Multiprocessors

IEEE Transactions on Software Engineering
Teapot: language support for writing memory coherence protocols

PLDI '96 Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
Interpartition communication with shared active packages

Proceedings of the conference on TRI-Ada '96: disciplined software development with Ada
ASHs: Application-specific handlers for high-performance messaging

Conference proceedings on Applications, technologies, architectures, and protocols for computer communications
An efficient implementation of Java's remote method invocation

Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Coordinated CPU and event scheduling for distributed multimedia applications

MULTIMEDIA '01 Proceedings of the ninth ACM international conference on Multimedia
Efficient Java RMI for parallel programming

ACM Transactions on Programming Languages and Systems (TOPLAS)
Evaluating the performance limitations of MPMD communication

SC '97 Proceedings of the 1997 ACM/IEEE conference on Supercomputing
Supporting parallel applications on clusters of workstations: The Virtual Communication Machine-based architecture

Cluster Computing
Models for Asynchronous Message Handling

IEEE Parallel & Distributed Technology: Systems & Technology
P-RIO: A Modular Parallel-Programming Environment

IEEE Concurrency
Client-Server Computing on Shrimp

IEEE Micro
Parallel Programming through Configurable Interconnectable Objects

HIPS '97 Proceedings of the 1997 Workshop on High-Level Programming Models and Supportive Environments (HIPS '97)
High-speed I/O: the operating system as a signalling mechanism

NICELI '03 Proceedings of the ACM SIGCOMM workshop on Network-I/O convergence: experience, lessons, implications
Performance and modularity benefits of message-driven execution

Journal of Parallel and Distributed Computing
Flexible cross-domain event delivery for quality-managed multimedia applications

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)
Signals, timers, and continuations for multithreaded user-level protocols

Software—Practice & Experience - Research Articles
Cheating the I/O bottleneck: network storage with Trapeze/Myrinet

ATEC '98 Proceedings of the annual conference on USENIX Annual Technical Conference
A (condensed) parametric study of optimistic computation in wide-area, distributed environments

Proceedings of the 15th ACM Mardi Gras conference: From lightweight mash-ups to lambda grids: Understanding the spectrum of distributed computing requirements, applications, tools, infrastructures, interoperability, and the incremental adoption of key capabilities
Efficient, portable implementation of asynchronous multi-place programs

Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
AM++: a generalized active message framework

Proceedings of the 19th international conference on Parallel architectures and compilation techniques
A moving threads processor architecture MTPA

The Journal of Supercomputing

Abstract

Low-overhead message passing is critical to the performance of many applications. Active Messages reduce the software overhead for message handling: messages are run as handlers instead of as threads, which avoids the overhead of thread management and the unnecessary data copying of other communication models. Scheduling the execution of Active Messages is typically done by disabling and enabling interrupts, or by polling the network. This primitive scheduling control, combined with the fact that handlers are not schedulable entities, puts severe restrictions on the code that can be run in a message handler. This paper describes a new software mechanism, Optimistic Active Messages (OAM), that eliminates these restrictions; OAMs allow arbitrary user code to execute in handlers, and also allow handlers to block. Despite this gain in expressiveness, OAMs perform as well as Active Messages.

We used OAM as the base for an RPC system, Optimistic RPC (ORPC), for the Thinking Machines CM-5 multiprocessor; it consists of an optimized thread package and a stub compiler that hides communication details from the programmer. ORPC is 1.5 to 5 times faster than traditional RPC (TRPC) for small messages and performs as well as Active Messages (AM). Applications that primarily communicate using large data transfers or are fairly coarse-grained perform equally well, independent of whether AMs, ORPCs, or TRPCs are used. For applications that send many short messages, however, the ORPC and AM implementations are up to three times faster than the TRPC implementations. Using ORPC, programmers obtain the benefits of well-proven programming abstractions such as threads, mutexes, and condition variables, do not have to be concerned with communication details, and yet obtain nearly the performance of hand-coded Active Message programs.
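The idea of letting handlers run arbitrary code, and even block, can be illustrated with a small sketch. It is not the paper's implementation. It assumes one plausible reading of "optimistic": a message is first run inline as a handler, and if it reaches a point where it would have to wait (here, a held mutex), it is aborted before changing any state and re-run as an ordinary thread. All names (`WouldBlock`, `SharedCounter`, `Dispatcher`, `demo`) are hypothetical, and Python threads stand in for the CM-5 thread package.

```python
import threading


class WouldBlock(Exception):
    """Signals that an optimistically run handler cannot finish without blocking."""


class SharedCounter:
    """Toy shared object guarded by a mutex (hypothetical example state)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.value = 0

    def add(self, amount, optimistic):
        if optimistic:
            # Handler context: never wait. Abort before touching any state.
            if not self.lock.acquire(blocking=False):
                raise WouldBlock()
        else:
            # Thread context: waiting on the mutex is allowed.
            self.lock.acquire()
        try:
            self.value += amount
        finally:
            self.lock.release()


class Dispatcher:
    """Runs each incoming message inline first; falls back to a thread on abort."""

    def __init__(self):
        self.threads = []
        self.inline_runs = 0   # messages that completed as plain handlers
        self.promotions = 0    # messages that had to be re-run as threads

    def deliver(self, handler, *args):
        try:
            # Optimistic case: assume the handler will not need to block.
            handler(*args, optimistic=True)
            self.inline_runs += 1
        except WouldBlock:
            # Assumed fallback: re-run the same code as a schedulable thread.
            self.promotions += 1
            thread = threading.Thread(
                target=handler, args=args, kwargs={"optimistic": False}
            )
            thread.start()
            self.threads.append(thread)

    def join_all(self):
        for thread in self.threads:
            thread.join()
        self.threads.clear()


def demo():
    counter = SharedCounter()
    dispatcher = Dispatcher()
    dispatcher.deliver(counter.add, 1)   # mutex free: runs inline
    counter.lock.acquire()               # simulate a computation holding the mutex
    dispatcher.deliver(counter.add, 2)   # mutex held: aborted and promoted
    counter.lock.release()
    dispatcher.join_all()
    return counter.value, dispatcher.inline_runs, dispatcher.promotions
```

In this sketch the common case pays only for a non-blocking lock attempt, which is consistent with the abstract's claim that the added expressiveness need not cost performance; the thread is created only when the optimistic assumption fails. The abort is safe here only because the handler gives up before modifying shared state.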