Machine and collection abstractions for user-implemented data-parallel programming

  • Authors:
  • Magne Haveraaen

  • Affiliations:
  • Department of Informatics, University of Bergen, P.O. Box 7800, N-5020 BERGEN, Norway

  • Venue:
  • Scientific Programming
  • Year:
  • 2000


Abstract

Data parallelism has emerged as a fruitful approach to the parallelisation of compute-intensive programs. It has the advantage of mimicking the sequential (and deterministic) structure of programs, as opposed to task parallelism, where the explicit interaction of processes has to be programmed. In data parallelism, data structures, typically collection classes in the form of large arrays, are distributed over the processors of the target parallel machine. Trying to extract distribution aspects from conventional code often runs into problems with a lack of uniformity in the use of the data structures and in the expression of data dependency patterns within the code. Here we propose a framework with two conceptual classes, Machine and Collection. The Machine class abstracts the hardware's communication and distribution properties, giving the programmer high-level access to the important parts of the low-level architecture. The Machine class may readily be used in the implementation of a Collection class, giving the programmer full control over the parallel distribution of data, while also allowing a normal sequential implementation of this class. Any program using such a collection class can then be parallelised, without modification, by choosing between the sequential and parallel versions at link time. Experiments with a commercial application, built using the Sophus library which takes this approach to parallelisation, show good parallel speed-ups without any adaptation of the application program.