X10: concurrent programming for modern architectures

  • Authors:
  • Vijay A. Saraswat;Vivek Sarkar;Christoph von Praun

  • Affiliations:
  • IBM TJ Watson Research Center, Hawthorne, NY;IBM TJ Watson Research Center, Hawthorne, NY;IBM TJ Watson Research Center, Yorktown Heights, NY

  • Venue:
  • Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Two major trends are converging to reshape the landscape of concurrent object-oriented programming languages. First, trends in modern architectures (multi-core, accelerators, high performance clusters such as Blue Gene) are making concurrency and distribution inescapable for large classes of OO programmers. Second, experience with first-generation concurrent OO languages (e.g. Java threads and synchronization) have revealed several drawbacks of unstructured threads with lock-based synchronization. X10 is a second generation OO language designed to address both programmer productivity and parallel performance for modern architectures. It extends sequential Java with a handful of constructs for concurrency and distribution. It introduces a clustered address space to deal with distribution. A computation is thought of as running at multiple places, with many simultaneous activities operating in each place. Objects and activities once created in a particular place stay confined to that place. However, a data-structure (object) allocated in one place may contain a reference to an object allocated in anoter place. (Thus X10 supports a partitioned global address space. X10 is an explicitly concurrent language. It provides constructs for lightweight asynchrony, making it easy for programmers to write code for target architectures that provide massive parallelism. It provides for recursive fork-join parallelism for structured concurrency. It provides for termination detection so that collections of activities may be reliably sequenced (even if they run across multiple places). It provides for a very simple form of atomic blocks in lieu of locks for mutual exclusion. These constructs can be used to define more sophisticated synchronization constructs such as futures and clocks. X10 supports a rich notion of multi-dimensional index spaces (regions), together with a rich set of operations on regions. Regions are first-class data-structures -- they can be produced dynamically, stored in data-structures, passed around in method invocations etc. A distributed version of regions (distributions) is also defined. It specifies a mapping of every point in the underlying region to a place. An array is simply a mapping from a distribution to backing store of the given type, partitioned across various places in the manner described by the distribution. Rank-generic programming is supported through generic points. Parallel iteration constructs are also provided. X10 provides a rich framework for constraint-based value-dependent types. The programmer may specify types -- such as the type of square arrays of doubles of rank 2 -- which reference run-time constant (final) values. Classes and interfaces can be parametrized with properties, which are to be thought of as final instance fields. A dependent type is merely a constraint over these properties. Types are checked statically; this requires the compiler to use a constraint-solver. The design of the type-system and the implementation is modular so that a new constraint system can be defined and plugged into the language in a fairly routine fashion. Dynamic casts are also provided -- this permits an object to be checked at runtime for conformance to a dependent type. The compiler takes care of generating run-time code for performing such tests. The tutorial illustrate how common design patterns for concurrency and distribution can be naturally expressed in X10 (wait-free algorithms, data-flow synchronization, streaming parallelism, co-processor parallelism, hierarchical task-parallelism and phased computations). It shows design patterns for establishing that programs are determinate and/or deadlock-free. Examples are drawn from high-performance computing and middleware (transactions, event-driven computing). Participants will be encouraged to download the X10 implementation from SourceForge http://x10.sf.net. The source code for the implementation is released under the Eclipse Public Licence. The implementation consists of a translator from X10 to Java, and a multi-threaded runtime system in Java. Resulting programs may be run on any SMP that supports a Java Virtual Machine.