Sequential programming models express a total program order, of which only a partial order must be respected. This inhibits parallelizing tools from extracting scalable performance. Programmer-written semantic commutativity assertions provide a natural way of relaxing this partial order, thereby exposing parallelism implicitly in a program. Existing implicit parallel programming models based on semantic commutativity either require additional programming extensions or have limited expressiveness. This paper presents a generalized programming extension based on semantic commutativity, called Commutative Set (COMMSET), and associated compiler technology that enables multiple forms of parallelism. COMMSET expressions are syntactically succinct and enable the programmer to specify commutativity relations between groups of arbitrary structured code blocks. Using only this construct, serializing constraints that inhibit parallelization can be relaxed, independent of any particular parallelization strategy or concurrency control mechanism. COMMSET enables well-performing parallelizations in cases where they were previously inapplicable or non-performing. By extending eight sequential programs with only eight annotations per program on average, COMMSET and the associated compiler technology produced a geomean speedup of 5.7x on eight cores, compared to 1.5x for the best non-COMMSET parallelization.
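The core idea is easiest to see on a concrete data structure: two insertions into an unordered set commute semantically (either execution order yields the same set membership) even though they do not commute at the memory level (the internal layout depends on call order). The C sketch below illustrates that distinction under stated assumptions; set_insert and set_contains are illustrative helpers, and the annotation mentioned in the comments is a hypothetical COMMSET-style assertion, not the paper's actual syntax.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* A set as a singly linked list of keys; not from the paper, just an
 * illustration of semantic vs. memory-level commutativity. */
typedef struct Node { const char *key; struct Node *next; } Node;
typedef struct { Node *head; } Set;

/* Insert key if absent. Two calls to set_insert commute semantically:
 * either order yields the same set membership, though the internal list
 * order (and hence the raw memory state) depends on call order. A
 * COMMSET-style assertion would let the programmer state this to the
 * compiler; the concrete syntax here is hypothetical. */
static void set_insert(Set *s, const char *key) {
    for (Node *n = s->head; n; n = n->next)
        if (strcmp(n->key, key) == 0) return;      /* already present */
    Node *n = malloc(sizeof *n);
    n->key = key;
    n->next = s->head;                             /* prepend */
    s->head = n;
}

static int set_contains(const Set *s, const char *key) {
    for (Node *n = s->head; n; n = n->next)
        if (strcmp(n->key, key) == 0) return 1;
    return 0;
}

int main(void) {
    Set a = {0}, b = {0};

    /* The same two insertions in opposite orders. */
    set_insert(&a, "x"); set_insert(&a, "y");
    set_insert(&b, "y"); set_insert(&b, "x");

    /* Observable behavior is identical even though a.head and b.head chain
     * the nodes differently; this is the relaxation of the sequential order
     * that a commutativity assertion exposes to a parallelizing compiler. */
    printf("a contains x,y: %d %d\n", set_contains(&a, "x"), set_contains(&a, "y"));
    printf("b contains x,y: %d %d\n", set_contains(&b, "x"), set_contains(&b, "y"));

    free(a.head->next); free(a.head);
    free(b.head->next); free(b.head);
    return 0;
}

With such an assertion in place, a compiler is free to reorder or interleave the annotated calls across threads, provided some concurrency control (for example, a lock around the set) preserves the atomicity of each call, which matches the abstract's claim that the relaxation is independent of any particular parallelization strategy or concurrency control mechanism.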