Design and implementation of FMPL, a fast message-passing library for remote memory operations

Authors:
Osamu Tatebe;Umpei Nagashima;Satoshi Sekiguchi;Hisayoshi Kitabayashi;Yoshiyuki Hayashida
Affiliations:
National Institute of Advanced Industrial Science and Technology, Tsukuba, Japan;National Institute of Advanced Industrial Science and Technology, Tsukuba, Japan;National Institute of Advanced Industrial Science and Technology, Tsukuba, Japan;Hitachi Business Solution;Hitachi, Ltd., Software Division
Venue:
Proceedings of the 2001 ACM/IEEE conference on Supercomputing
Year:
2001

Abstract

FMPL, a fast message-passing library, has been designed and developed to maximize communication performance by exploiting general architectural communication support such as remote memory operations, and to maximize total performance by eliminating dynamic communication overhead and overlapping communication with computation. FMPL provides low-cost, general-purpose point-to-point communication and collective communication such as broadcast, barrier synchronization, and reduction. On a Hitachi SR8000, FMPL achieves an 8-byte latency of 12.8 μs, compared with 20 μs for MPI. FMPL is designed as a base for building higher-level message-passing libraries such as BLACS, as well as for applications that need maximum performance.
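
To put the two latency figures in proportion, the following minimal sketch computes the relative reduction and the latency ratio from the numbers quoted in the abstract. The function and constant names are illustrative only and are not part of FMPL; the MPI figure is taken as exactly 20 μs, which the abstract may have rounded.

```python
# 8-byte latencies quoted in the abstract (Hitachi SR8000), in microseconds.
FMPL_LATENCY_US = 12.8
MPI_LATENCY_US = 20.0  # assumed exact; the abstract may have rounded this value


def latency_reduction(new_us: float, base_us: float) -> float:
    """Fraction by which new_us is lower than base_us."""
    return (base_us - new_us) / base_us


def latency_ratio(new_us: float, base_us: float) -> float:
    """How many times longer base_us is than new_us."""
    return base_us / new_us


reduction = latency_reduction(FMPL_LATENCY_US, MPI_LATENCY_US)
ratio = latency_ratio(FMPL_LATENCY_US, MPI_LATENCY_US)

print(f"reduction: {reduction:.0%}, ratio: {ratio:.2f}x")
# prints "reduction: 36%, ratio: 1.56x"
```

On these figures, the FMPL latency is about 36% lower than the MPI latency for 8-byte messages.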