Active messages have proven to be an effective approach to certain communication problems in high-performance computing. Many MPI implementations, as well as runtimes for Partitioned Global Address Space languages, use active messages in their low-level transport layers. However, most active message frameworks expose low-level programming interfaces that require significant programming effort to use directly in applications and that also preclude optimization opportunities. In this paper we present AM++, a new user-level library for active messages based on generic programming techniques. Our library allows message handlers to be run in an explicit loop that can be optimized and vectorized by the compiler and that can also be executed in parallel on multicore architectures. Runtime optimizations, such as message combining and filtering, are provided by the library itself, removing the need to implement that functionality at the application level. An evaluation of AM++ with distributed-memory graph algorithms demonstrates the usability benefits of these library features as well as their performance advantages.
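The pattern the abstract describes can be sketched in a few lines: messages bound for the same destination are combined into a batch, and the user's handler runs in an explicit loop over the whole batch, which the compiler is free to optimize or vectorize. The sketch below is illustrative only, assuming hypothetical names (`message_type`, `send`, `flush`); it is not the actual AM++ API, and it simulates delivery within a single process rather than over a network transport.

```cpp
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch of a generic active-message type with message
// combining. NOT the real AM++ interface; names are assumptions.
template <typename T>
class message_type {
public:
  // The handler receives a whole batch so it can loop over messages
  // explicitly (the loop the compiler can optimize and vectorize).
  using handler = std::function<void(const std::vector<T>&)>;

  explicit message_type(handler h) : handler_(std::move(h)) {}

  // Queue a message for a destination rank; messages to the same
  // destination are combined into one buffer instead of being sent
  // (and handled) one at a time.
  void send(int dest, const T& payload) {
    buffers_[dest].push_back(payload);
  }

  // Deliver all queued messages: one handler invocation per batch.
  void flush() {
    for (auto& [dest, batch] : buffers_) handler_(batch);
    buffers_.clear();
  }

private:
  handler handler_;
  std::unordered_map<int, std::vector<T>> buffers_;
};
```

In a distributed-memory graph algorithm, `send` might carry a vertex-relaxation request to the rank owning that vertex, with the batched handler applying all pending relaxations in one pass; filtering (e.g. dropping messages that cannot improve a distance) could be added inside `send` along the same lines.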