Yada: Straightforward parallel programming

  • Authors:
  • David Gay, Joel Galenson, Mayur Naik, Kathy Yelick

  • Affiliations:
  • David Gay, Mayur Naik: Intel Labs Berkeley, 2150 Shattuck Ave Ste 1300, Berkeley, CA 94704, USA
  • Joel Galenson, Kathy Yelick: EECS Department, University of California, Berkeley, CA 94720, USA

  • Venue:
  • Parallel Computing

  • Year:
  • 2011

Abstract

Now that multicore chips are common, providing an approach to parallel programming that is usable by regular programmers has become even more important. This challenge has a silver lining: any useful speedup on a program is valuable in and of itself, even if the resulting performance falls short of the best possible parallel performance on the same program. To help achieve this goal, Yada is an explicitly parallel programming language with sequential semantics. It is explicitly parallel because we believe that programmers need to identify how and where to exploit potential parallelism, but it has sequential semantics so that programmers can understand and debug their parallel programs in the way they already know, i.e. as if they were sequential. The key new idea in Yada is a set of types that support parallel operations while still preserving sequential semantics. Beyond the natural read-sharing found in most previous sequential-like languages, Yada supports three other kinds of sharing. Write-once locations support a single write and multiple reads, and two kinds of sharing for locations updated with an associative operator generalise the reduction and parallel-prefix operations found in many data-parallel languages. We expect to support other kinds of sharing in the future. We have evaluated our Yada prototype on eight algorithms and four applications, and found that programs require only a few changes to get useful speedups ranging from 2.2x to 6.3x on an 8-core machine. Yada's performance is mostly comparable to parallel implementations of the same programs using OpenMP or explicit threads.
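The reduction-style sharing mentioned in the abstract can be made concrete with a short sketch. The C/OpenMP fragment below is not Yada code (the abstract does not give Yada syntax); it is only an illustration, under the assumption that a Yada reduction location behaves like the accumulator here: each parallel iteration updates a shared location with an associative operator (+), and the final value matches what a sequential execution of the loop would produce.

    /* Illustrative only: plain C + OpenMP, not Yada syntax.
     * Shows the kind of associative-update sharing (a sum reduction)
     * that Yada's sharing types generalise while preserving the
     * sequential result. Compile with: cc -fopenmp reduce.c */
    #include <stdio.h>

    int main(void) {
        enum { N = 1000000 };
        long long sum = 0;

        /* Each thread accumulates a private partial sum; OpenMP
         * combines the partials with '+' at the end. Because '+'
         * is associative, the parallel result equals the sequential
         * loop's result -- the property Yada's types guarantee. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 1; i <= N; i++)
            sum += i;

        printf("%lld\n", sum);  /* N*(N+1)/2 = 500000500000 */
        return 0;
    }

Parallel-prefix sharing, also mentioned in the abstract, extends the same idea to loops where each iteration additionally reads the value accumulated so far (a running sum, for example) while still producing the sequential answer.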