Co-array Fortran (CAF), a small set of extensions to Fortran 90, is an emerging model for scalable, global address space parallel programming. CAF's global address space programming model simplifies the development of single-program-multiple-data parallel programs by shifting the burden of managing communication details from developers to compilers. This paper describes cafc, a prototype implementation of an open-source, multi-platform CAF compiler that generates code well suited to today's commodity clusters. The cafc compiler translates CAF into Fortran 90 plus calls to one-sided communication primitives. The paper describes key details of cafc's approach to generating efficient code for multiple platforms. Experiments compare the performance of CAF and MPI versions of several NAS parallel benchmarks on an Alpha cluster with a Quadrics interconnect, an Itanium 2 cluster with a Myrinet 2000 interconnect, and an Itanium 2 cluster with a Quadrics interconnect. These experiments show that cafc compiles CAF programs into code that delivers performance roughly equal to that of hand-optimized MPI programs.
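To make the idea of translating co-array references into one-sided communication more concrete, the following is a minimal Python sketch of the general technique, not cafc's actual generated code or runtime interface. The names `Image`, `Runtime`, `get`, and `put` are hypothetical, the "images" are simulated inside a single process, and image indices are 0-based here for simplicity (CAF itself numbers images from 1).

```python
class Image:
    """One simulated SPMD process image holding its local part of a co-array."""

    def __init__(self, rank, size):
        self.rank = rank
        self.data = [0.0] * size


class Runtime:
    """Hypothetical stand-in for a one-sided communication layer."""

    def __init__(self, num_images, size):
        self.images = [Image(rank, size) for rank in range(num_images)]

    def get(self, target, index):
        # One-sided read: the target image takes no explicit action.
        return self.images[target].data[index]

    def put(self, target, index, value):
        # One-sided write: the target image takes no explicit action.
        self.images[target].data[index] = value


def demo():
    rt = Runtime(num_images=4, size=8)
    # Each image fills its local co-array part with its own rank.
    for img in rt.images:
        img.data = [float(img.rank)] * 8

    # A CAF-style remote read such as  x = a(3)[2]  would conceptually
    # become a one-sided get from image 2.
    x = rt.get(target=2, index=3)

    # A CAF-style remote write such as  a(5)[1] = x  would conceptually
    # become a one-sided put to image 1.
    rt.put(target=1, index=5, value=x)
    return x, rt.images[1].data[5]
```

Calling `demo()` performs one remote read and one remote write without any matching receive on the target side, which is the property that distinguishes one-sided primitives from two-sided message passing such as MPI send/receive pairs.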