As part of the recent focus on increasing the productivity of parallel application developers, Co-array Fortran (CAF) has emerged as an appealing alternative to the Message Passing Interface (MPI). CAF belongs to the family of global address space parallel programming languages; such languages provide the abstraction of globally addressable memory accessed using one-sided communication. At Rice University we are developing cafc, an open-source, multi-platform CAF compiler. Our earlier studies show that cafc-compiled CAF programs achieve performance similar to that of corresponding MPI codes for the NAS Parallel Benchmarks. In this paper, we present a study of several CAF implementations of Sweep3D on four modern architectures. We analyze the impact of using one-sided communication in Sweep3D, identify potential sources of inefficiency, and suggest ways to address them. Our results show that we achieve performance comparable to that of the MPI version on three cluster-based architectures and outperform it by up to 10% on the SGI Altix 3000.
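To make the one-sided, globally addressable memory model concrete, the sketch below shows the basic idiom: a co-array replicated on every image, a one-sided put into a neighbor image's copy, and explicit synchronization. This is a minimal, hypothetical example in Fortran 2008 coarray syntax (which closely mirrors CAF); it is not taken from the paper's Sweep3D code, and the names buf and right are illustrative only.

    program caf_put_sketch
      ! Hypothetical sketch of the one-sided CAF model: every image holds a
      ! copy of the co-array, and any image may write directly into another
      ! image's copy without that image's participation.
      implicit none
      integer, parameter :: n = 8
      real    :: buf(n)[*]                  ! co-array: remotely addressable on all images
      integer :: me, np, right

      me = this_image()
      np = num_images()
      right = merge(1, me + 1, me == np)    ! periodic right neighbor

      buf = real(me)
      sync all                              ! all images finish initializing

      buf(:)[right] = real(me)              ! one-sided put into the neighbor's memory
      sync all                              ! make the put visible before further use
    end program caf_put_sketch

With any coarray-capable Fortran compiler this compiles and runs as-is; the put requires no matching receive on the target image, which is the property the paper's communication analysis revolves around.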