Optimizing array accesses in high productivity languages

  • Authors:
  • Mackale Joyner;Zoran Budimlić;Vivek Sarkar

  • Affiliations:
  • Rice University, Houston TX;Rice University, Houston TX;Rice University, Houston TX

  • Venue:
  • HPCC'07 Proceedings of the Third international conference on High Performance Computing and Communications
  • Year:
  • 2007

Abstract

One of the outcomes of DARPA's HPCS program has been the creation of three new high productivity languages: Chapel, Fortress, and X10. While these languages have improved language expressiveness and programmer productivity, several technical challenges remain in delivering high performance with them. In the absence of optimization, the high-level language constructs that improve productivity can result in order-of-magnitude degradations in runtime performance. This paper addresses the problem of efficient code generation for high-level array accesses in the X10 language. Two aspects of high-level array accesses in X10 are important for productivity but also pose significant performance challenges: the accesses are performed through Point objects rather than integer indices, and variables containing references to arrays are rank-independent. Our solution to the first challenge is to extend the X10 compiler with automatic inlining and scalar replacement of Point objects. Our partial solution to the second challenge is to use X10's dependent type system so that the programmer can annotate array variable declarations with additional information about the rank and region of the variable, allowing the compiler to generate efficient code wherever that dependent type information is available. Although this paper focuses on high-level array accesses in X10, our approach is applicable to similar constructs in other languages. Our experimental results for single-thread performance demonstrate that these compiler optimizations enable high-level X10 array accesses with implicit ranks and Points to improve performance by a factor of up to 5.4× over unoptimized X10 code, and to achieve performance comparable (from 48% to 100%) to that of lower-level Java programs. These results underscore the importance of the optimization techniques presented in this paper for achieving high performance with high productivity.
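The effect of inlining and scalar replacement of Point objects can be illustrated with a small Java analogue. This is only a sketch under stated assumptions: `Point2`, `Array2`, and both summation methods are hypothetical names invented for this illustration. It is neither X10 source nor the output of the X10 compiler, and it fixes the rank at 2 for simplicity. The first loop allocates an index object per element access, and the second shows the shape of the code once the accessor is inlined and the object's fields are replaced by the loop's integer variables.

```java
// Hypothetical Java analogue of the two access styles; not X10 code and not
// the output of the X10 compiler described in the abstract.
public class PointAccessSketch {

    /** Illustrative stand-in for an immutable rank-2 index object. */
    static final class Point2 {
        final int i;
        final int j;
        Point2(int i, int j) { this.i = i; this.j = j; }
    }

    /** Illustrative rank-2 array stored row-major in a flat buffer. */
    static final class Array2 {
        final double[] data;
        final int rows;
        final int cols;
        Array2(int rows, int cols) {
            this.rows = rows;
            this.cols = cols;
            this.data = new double[rows * cols];
        }
        // High-level access: the index is carried by an object.
        double get(Point2 p) { return data[p.i * cols + p.j]; }
        void set(Point2 p, double v) { data[p.i * cols + p.j] = v; }
    }

    /** Unoptimized style: one Point2 allocation per element access. */
    static double sumWithPoints(Array2 a) {
        double sum = 0.0;
        for (int i = 0; i < a.rows; i++) {
            for (int j = 0; j < a.cols; j++) {
                Point2 p = new Point2(i, j);
                sum += a.get(p);
            }
        }
        return sum;
    }

    /** Shape after inlining get() and scalar-replacing p: integer indexing only. */
    static double sumScalarReplaced(Array2 a) {
        double sum = 0.0;
        for (int i = 0; i < a.rows; i++) {
            for (int j = 0; j < a.cols; j++) {
                sum += a.data[i * a.cols + j];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Array2 a = new Array2(3, 4);
        for (int i = 0; i < a.rows; i++) {
            for (int j = 0; j < a.cols; j++) {
                a.set(new Point2(i, j), i * 10 + j);
            }
        }
        // Both traversals visit the same elements in the same order.
        System.out.println(sumWithPoints(a) == sumScalarReplaced(a)); // prints "true"
    }
}
```

Both methods compute the same result; the difference is that the second performs no object allocation or field loads inside the loop. The rank-independence challenge is not modeled here, since the rank is hard-coded in the hypothetical `Array2` class.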