An Approach to Data Distributions in Chapel

  • Authors:
  • R. E. Diaconescu; H. P. Zima

  • Affiliations:
  • CACR, California Institute of Technology, Pasadena, California 91125; Jet Propulsion Laboratory, California Institute of Technology, Pasadena, California 91109

  • Venue:
  • International Journal of High Performance Computing Applications
  • Year:
  • 2007


Abstract

A key characteristic of today's high performance computing systems is a physically distributed memory, which makes the efficient management of locality essential for taking advantage of the performance enhancements offered by these architectures. Currently, the standard technique for programming such systems involves the extension of traditional sequential programming languages with explicit message-passing libraries, in a processor-centric model for programming and execution. It is commonly understood that this programming paradigm results in complex, brittle, and error-prone programs, because of the way in which algorithms and communication are inextricably interwoven. This paper describes a new approach to locality awareness, which focuses on data distributions in high-productivity languages. Data distributions provide an abstract specification of the partitioning of large-scale data collections across memory units, supporting coarse-grain parallel computation and locality of access at a high level of abstraction. Our design, which is based on a new programming language called Chapel, is motivated by the need to provide a high-productivity paradigm for the development of efficient and reusable parallel code. We present an object-oriented framework that allows the explicit specification of the mapping of elements in a collection to memory units, the control of the arrangement of elements within such units, the definition of sequential and parallel iteration over collections, and the formulation of specialized allocation policies as required for advanced applications. The result is a concise high-productivity programming model that separates algorithms from data representation and enables reuse of distributions, allocation policies, and data structures.
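As a concrete illustration of the idea the abstract describes, the sketch below uses Chapel's standard Block distribution to map a collection's indices across locales (memory units) while the algorithm itself remains distribution-agnostic. This is a minimal example in the syntax of later Chapel releases rather than the exact framework of the paper; the problem size n and the loop body are purely illustrative.

    use BlockDist;

    config const n = 8;

    const Space = {1..n, 1..n};
    // Map the indices of Space block-wise across the memories of all locales.
    const D: domain(2) dmapped Block(boundingBox=Space) = Space;

    // An array declared over D places each element on the locale
    // that owns the element's index.
    var A: [D] real;

    // Data-parallel iteration: each iteration executes on the locale
    // that owns A[i, j], so element accesses are local.
    forall (i, j) in D do
      A[i, j] = i + j/10.0;

    writeln(A);

Swapping Block for another distribution (e.g. Cyclic from CyclicDist) changes only the declaration of D; the forall loop and the algorithm it expresses are untouched, which reflects the separation of algorithms from data representation that the abstract emphasizes.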