Active messages have proven to be an effective approach to certain communication problems in high-performance computing. Many MPI implementations, as well as runtimes for Partitioned Global Address Space languages, use active messages in their low-level transport layers. However, most active message frameworks expose low-level programming interfaces that require significant programming effort to use directly in applications and that also preclude optimization opportunities. In this paper we present AM++, a new user-level library for active messages based on generic programming techniques. Our library allows message handlers to be run in an explicit loop that can be optimized and vectorized by the compiler and that can also be executed in parallel on multicore architectures. Runtime optimizations, such as message combining and filtering, are provided by the library itself, removing the need to implement that functionality at the application level. An evaluation of AM++ with distributed-memory graph algorithms demonstrates the usability benefits of these library features as well as their performance advantages.
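The pattern the abstract describes can be sketched in a few lines: messages bound for the same destination are combined into a batch, and the user's handler runs in an explicit loop over the whole batch, which the compiler is free to optimize or vectorize. The sketch below is illustrative only, assuming hypothetical names (`message_type`, `send`, `flush`); it is not the actual AM++ API, and it simulates delivery within a single process rather than over a network transport.

```cpp
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch of a generic active-message type with message
// combining. NOT the real AM++ interface; names are assumptions.
template <typename T>
class message_type {
public:
  // The handler receives a whole batch so it can loop over messages
  // explicitly (the loop the compiler can optimize and vectorize).
  using handler = std::function<void(const std::vector<T>&)>;

  explicit message_type(handler h) : handler_(std::move(h)) {}

  // Queue a message for a destination rank; messages to the same
  // destination are combined into one buffer instead of being sent
  // (and handled) one at a time.
  void send(int dest, const T& payload) {
    buffers_[dest].push_back(payload);
  }

  // Deliver all queued messages: one handler invocation per batch.
  void flush() {
    for (auto& [dest, batch] : buffers_) handler_(batch);
    buffers_.clear();
  }

private:
  handler handler_;
  std::unordered_map<int, std::vector<T>> buffers_;
};
```

In a distributed-memory graph algorithm, `send` might carry a vertex-relaxation request to the rank owning that vertex, with the batched handler applying all pending relaxations in one pass; filtering (e.g. dropping messages that cannot improve a distance) could be added inside `send` along the same lines.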