Evaluating support for global address space languages on the Cray X1

Authors:
Christian Bell;Wei-Yu Chen;Dan Bonachea;Katherine Yelick
Affiliations:
University of California at Berkeley;University of California at Berkeley;University of California at Berkeley;University of California at Berkeley
Venue:
Proceedings of the 18th annual international conference on Supercomputing
Year:
2004

Citing 14
Cited 9

Parallel programming in Split-C

Proceedings of the 1993 ACM/IEEE conference on Supercomputing
Global communication analysis and optimization

PLDI '96 Proceedings of the ACM SIGPLAN 1996 conference on Programming language design and implementation
Synchronization and communication in the T3E multiprocessor

Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Compiler-based prefetching for recursive data structures

Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Analyses and optimizations for shared address space programs

Journal of Parallel and Distributed Computing - Special issue on compilation techniques for distributed memory systems
LogGP: incorporating long messages into the LogP model for parallel computation

Journal of Parallel and Distributed Computing
Communication optimizations for parallel C programs

Journal of Parallel and Distributed Computing - Special issue on compilation and architectural support for parallel applications
UPC performance and potential: a NPB experimental study

Proceedings of the 2002 ACM/IEEE conference on Supercomputing
A performance analysis of the Berkeley UPC compiler

ICS '03 Proceedings of the 17th annual international conference on Supercomputing
Managing Concurrent Access for Shared Memory Active Messages

IPPS '98 Proceedings of the 12th International Parallel Processing Symposium
Titanium Language Reference Manual

Titanium Language Reference Manual
GASNet Specification, v1.1

GASNet Specification, v1.1
Early Evaluation of the Cray X1

Proceedings of the 2003 ACM/IEEE conference on Supercomputing
Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations

Proceedings of the 2003 ACM/IEEE conference on Supercomputing

Facilitating the search for compositions of program transformations

Proceedings of the 19th annual international conference on Supercomputing
Shared memory programming for large scale machines

Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
Semi-automatic composition of loop transformations for deep parallelism and memory hierarchies

International Journal of Parallel Programming
Coprocessor design to support MPI primitives in configurable multiprocessors

Integration, the VLSI Journal
Parallel Languages and Compilers: Perspective From the Titanium Experience

International Journal of High Performance Computing Applications
The Cray BlackWidow: a highly scalable vector multiprocessor

Proceedings of the 2007 ACM/IEEE conference on Supercomputing
BSP2OMP: A Compiler for Translating BSP Programs to OpenMP

International Journal of Parallel, Emergent and Distributed Systems - Advances in Parallel and Distributed Computational Models
Optimizing bandwidth limited problems using one-sided communication and overlap

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
GA-GPU: extending a library-based global address space programming model for scalable heterogeneous computing systems

Proceedings of the 9th conference on Computing Frontiers

Abstract

The Cray X1 was recently introduced as the first in a new line of parallel systems to combine high-bandwidth vector processing with an MPP system architecture. Alongside capabilities such as automatic fine-grained data parallelism through vector instructions, the X1 offers hardware support for a transparent global address space (GAS), which makes it an interesting target for GAS languages. In this paper, we describe our experience developing a portable, open-source, high-performance compiler for Unified Parallel C (UPC), an SPMD global address space language extension of ISO C. As part of our implementation effort, we evaluate the X1's hardware support for GAS languages and provide empirical performance characterizations of how the Berkeley UPC compiler can leverage features such as vectorization and global pointers. We discuss several difficulties encountered with the Cray C compiler that are likely to present challenges for many users, especially implementors of libraries and source-to-source translators. Finally, we analyze the performance of our compiler on several benchmark programs and show that, despite some limitations of the current compilation approach, the Berkeley UPC compiler uses the X1 network more effectively than MPI or SHMEM and generates serial code whose vectorizability is comparable to that of the original C code.