MagPIe: MPI's collective communication operations for clustered wide area systems

Authors:
Thilo Kielmann;Rutger F. H. Hofman;Henri E. Bal;Aske Plaat;Raoul A. F. Bhoedjang
Affiliations:
Department of Mathematics and Computer Science, Vrije Universiteit, Amsterdam, The Netherlands (all authors)
Venue:
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Year:
1999


Abstract

Writing parallel applications for computational grids is a challenging task. To achieve good performance, algorithms designed for local area networks must be adapted to the differences in link speeds. Collective operations, such as broadcast and reduce, are an important class of such algorithms. We have developed MAGPIE, a library of collective communication operations optimized for wide area systems. MAGPIE's algorithms send the minimal amount of data over the slow wide area links and incur only a single wide area latency. Using our system, existing MPI applications can be run unmodified on geographically distributed systems. On moderate cluster sizes, using a wide area latency of 10 milliseconds and a bandwidth of 1 MByte/s, MAGPIE executes operations up to 10 times faster than MPICH, a widely used MPI implementation; application kernels improve by up to a factor of 4. Due to the structure of our algorithms, MAGPIE's advantage increases for higher wide area latencies.
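
The two properties claimed in the abstract (minimal wide area traffic and a single wide area latency) can be illustrated with a small sketch of a cluster-aware broadcast. This is not MAGPIE's implementation, which the abstract does not describe in detail. It assumes a hypothetical layout of 4 clusters with 4 ranks each, a flat binomial tree as the cluster-unaware baseline, and a two-level scheme in which the root sends once to one coordinator per remote cluster and each coordinator then broadcasts locally. All function names are illustrative.

```python
def binomial_edges(ranks):
    """(sender, receiver) edges of a binomial broadcast tree rooted at ranks[0]."""
    edges = []
    step = 1
    while step < len(ranks):
        # In each round, every rank that already holds the data sends once.
        for i in range(step):
            if i + step < len(ranks):
                edges.append((ranks[i], ranks[i + step]))
        step *= 2
    return edges


def cluster_aware_edges(clusters, root):
    """Two-level broadcast: one wide area send per remote cluster, then local trees.

    Assumption for this sketch: the first rank of each remote cluster acts
    as its coordinator.
    """
    edges = []
    for members in clusters:
        if root in members:
            coord = root
        else:
            coord = members[0]
            edges.append((root, coord))  # the only wide area message for this cluster
        ordered = [coord] + [r for r in members if r != coord]
        edges.extend(binomial_edges(ordered))
    return edges


def wan_messages(edges, cluster_of):
    """Number of messages that cross a cluster boundary."""
    return sum(1 for s, d in edges if cluster_of[s] != cluster_of[d])


def max_wan_hops(edges, cluster_of, root):
    """Largest number of wide area hops on any path from the root to a rank.

    Relies on edges being listed so that a sender appears as a receiver
    (or is the root) before it sends, which both builders above guarantee.
    """
    hops = {root: 0}
    for s, d in edges:
        hops[d] = hops[s] + (1 if cluster_of[s] != cluster_of[d] else 0)
    return max(hops.values())


# Hypothetical layout: 4 clusters of 4 ranks, ranks numbered contiguously.
clusters = [list(range(c * 4, c * 4 + 4)) for c in range(4)]
cluster_of = {r: c for c, members in enumerate(clusters) for r in members}

flat = binomial_edges(list(range(16)))
two_level = cluster_aware_edges(clusters, root=0)

print("flat:      WAN messages =", wan_messages(flat, cluster_of),
      " max WAN hops =", max_wan_hops(flat, cluster_of, 0))
print("two-level: WAN messages =", wan_messages(two_level, cluster_of),
      " max WAN hops =", max_wan_hops(two_level, cluster_of, 0))
```

Under these assumptions the flat tree sends 12 of its 15 messages across cluster boundaries and some ranks sit behind two wide area hops, while the two-level scheme sends 3 wide area messages (one per remote cluster) and no rank is more than one wide area hop from the root. Each additional wide area hop adds a full wide area latency to the completion time, which is consistent with the abstract's remark that the advantage grows as wide area latency increases.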