X10: concurrent programming for modern architectures

Authors:
Vijay A. Saraswat;Vivek Sarkar;Christoph von Praun
Affiliations:
IBM TJ Watson Research Center, Hawthorne, NY;IBM TJ Watson Research Center, Hawthorne, NY;IBM TJ Watson Research Center, Yorktown Heights, NY
Venue:
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Year:
2007

Abstract

Two major trends are converging to reshape the landscape of concurrent object-oriented programming languages. First, trends in modern architectures (multi-core, accelerators, high-performance clusters such as Blue Gene) are making concurrency and distribution inescapable for large classes of OO programmers. Second, experience with first-generation concurrent OO languages (e.g., Java threads and synchronization) has revealed several drawbacks of unstructured threads with lock-based synchronization.

X10 is a second-generation OO language designed to address both programmer productivity and parallel performance for modern architectures. It extends sequential Java with a handful of constructs for concurrency and distribution. It introduces a clustered address space to deal with distribution. A computation is thought of as running at multiple places, with many simultaneous activities operating in each place. Objects and activities, once created in a particular place, stay confined to that place. However, a data-structure (object) allocated in one place may contain a reference to an object allocated in another place. (Thus X10 supports a partitioned global address space.)

X10 is an explicitly concurrent language. It provides constructs for lightweight asynchrony, making it easy for programmers to write code for target architectures that provide massive parallelism. It provides for recursive fork-join parallelism for structured concurrency. It provides for termination detection so that collections of activities may be reliably sequenced (even if they run across multiple places). It provides for a very simple form of atomic blocks in lieu of locks for mutual exclusion. These constructs can be used to define more sophisticated synchronization constructs such as futures and clocks.

X10 supports a rich notion of multi-dimensional index spaces (regions), together with a rich set of operations on regions.
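To make the notion of a region concrete, the following is a minimal sketch in Java (the language X10 extends) of a rectangular index space of arbitrary rank with a few operations on it. The class `RectRegion` and its methods are hypothetical names chosen for illustration; this is not the X10 region API, and it covers only the dense rectangular case.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical rectangular region: all integer points p with lo[i] <= p[i] <= hi[i]. */
final class RectRegion {
    final int[] lo;
    final int[] hi;

    RectRegion(int[] lo, int[] hi) {
        if (lo.length != hi.length) {
            throw new IllegalArgumentException("rank mismatch");
        }
        this.lo = lo.clone();
        this.hi = hi.clone();
    }

    /** Number of dimensions; code written against this class works for any rank. */
    int rank() { return lo.length; }

    boolean isEmpty() {
        for (int i = 0; i < rank(); i++) {
            if (lo[i] > hi[i]) return true;
        }
        return false;
    }

    /** Number of points in the region. */
    long size() {
        if (isEmpty()) return 0;
        long n = 1;
        for (int i = 0; i < rank(); i++) n *= (hi[i] - lo[i] + 1);
        return n;
    }

    boolean contains(int... p) {
        if (p.length != rank()) return false;
        for (int i = 0; i < rank(); i++) {
            if (p[i] < lo[i] || p[i] > hi[i]) return false;
        }
        return true;
    }

    /** One example of an operation on regions: the intersection is again a region. */
    RectRegion intersect(RectRegion other) {
        if (other.rank() != rank()) throw new IllegalArgumentException("rank mismatch");
        int[] l = new int[rank()];
        int[] h = new int[rank()];
        for (int i = 0; i < rank(); i++) {
            l[i] = Math.max(lo[i], other.lo[i]);
            h[i] = Math.min(hi[i], other.hi[i]);
        }
        return new RectRegion(l, h);
    }

    /** Enumerates all points with the last dimension varying fastest. */
    List<int[]> points() {
        List<int[]> out = new ArrayList<>();
        if (isEmpty()) return out;
        int[] p = lo.clone();
        while (true) {
            out.add(p.clone());
            int d = rank() - 1;
            // Odometer step: reset exhausted trailing dimensions, then bump the next one.
            while (d >= 0 && p[d] == hi[d]) { p[d] = lo[d]; d--; }
            if (d < 0) break;
            p[d]++;
        }
        return out;
    }
}
```

In this sketch, intersecting the region [0..2, 0..3] with [1..5, 2..5] yields the four-point region [1..2, 2..3]. Because a region is an ordinary object, it can be built at run time, stored, and passed to methods.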
Regions are first-class data-structures -- they can be produced dynamically, stored in data-structures, passed around in method invocations, etc. A distributed version of regions (distributions) is also defined. It specifies a mapping of every point in the underlying region to a place. An array is simply a mapping from a distribution to a backing store of the given type, partitioned across various places in the manner described by the distribution. Rank-generic programming is supported through generic points. Parallel iteration constructs are also provided.

X10 provides a rich framework for constraint-based value-dependent types. The programmer may specify types -- such as the type of square arrays of doubles of rank 2 -- which reference run-time constant (final) values. Classes and interfaces can be parametrized with properties, which are to be thought of as final instance fields. A dependent type is merely a constraint over these properties. Types are checked statically; this requires the compiler to use a constraint-solver. The design of the type-system and the implementation is modular so that a new constraint system can be defined and plugged into the language in a fairly routine fashion. Dynamic casts are also provided -- this permits an object to be checked at runtime for conformance to a dependent type. The compiler takes care of generating run-time code for performing such tests.

The tutorial illustrates how common design patterns for concurrency and distribution can be naturally expressed in X10 (wait-free algorithms, data-flow synchronization, streaming parallelism, co-processor parallelism, hierarchical task-parallelism and phased computations). It shows design patterns for establishing that programs are determinate and/or deadlock-free. Examples are drawn from high-performance computing and middleware (transactions, event-driven computing). Participants will be encouraged to download the X10 implementation from SourceForge (http://x10.sf.net).
The source code for the implementation is released under the Eclipse Public License. The implementation consists of a translator from X10 to Java, and a multi-threaded runtime system in Java. The resulting programs may be run on any SMP that supports a Java Virtual Machine.
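As a rough illustration of how the lightweight asynchrony, termination detection, and atomic blocks described in the abstract might be expressed on a multi-threaded Java runtime, the sketch below approximates them with standard `java.util.concurrent` classes. It is not the actual X10 runtime or its generated code: `FinishScope`, `async`, `awaitAll`, `atomic`, and `LeafCount` are hypothetical names, places are not modeled, and a single global lock is only one possible (assumed) way to realize atomic blocks.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;

/** Hypothetical finish scope: waits for every activity spawned in it, including nested ones. */
final class FinishScope {
    private static final Object ATOMIC_LOCK = new Object();
    private final ExecutorService pool;
    // One party for the scope itself; each activity adds one more (Phaser caps parties at 65535).
    private final Phaser pending = new Phaser(1);

    FinishScope(ExecutorService pool) { this.pool = pool; }

    /** Spawns an activity; it is registered before it starts so it cannot be missed. */
    void async(Runnable body) {
        pending.register();
        pool.execute(() -> {
            try {
                body.run();
            } finally {
                pending.arriveAndDeregister();
            }
        });
    }

    /** Termination detection: returns once all spawned activities have completed. */
    void awaitAll() {
        pending.arriveAndAwaitAdvance();
    }

    /** Atomic block, approximated here by one global lock. */
    static void atomic(Runnable body) {
        synchronized (ATOMIC_LOCK) {
            body.run();
        }
    }
}

/** Recursive fork-join example: one activity per node of a complete binary tree. */
final class LeafCount {
    static int countLeaves(int depth) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            FinishScope finish = new FinishScope(pool);
            int[] leaves = {0};
            visit(finish, depth, leaves);
            finish.awaitAll(); // all nested activities are done past this point
            return leaves[0];
        } finally {
            pool.shutdown();
        }
    }

    private static void visit(FinishScope finish, int depth, int[] leaves) {
        if (depth == 0) {
            FinishScope.atomic(() -> leaves[0]++);
            return;
        }
        // A child is registered before its parent deregisters, so the scope never ends early.
        finish.async(() -> visit(finish, depth - 1, leaves));
        finish.async(() -> visit(finish, depth - 1, leaves));
    }
}
```

With these definitions, `LeafCount.countLeaves(5)` returns 32. Activities in the sketch never block on one another, which is why a small fixed thread pool suffices here.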