A programming model for deterministic task parallelism

Authors:
Polyvios Pratikakis;Hans Vandierendonck;Spyros Lyberis;Dimitrios S. Nikolopoulos
Affiliations:
FORTH-ICS;Ghent University;FORTH-ICS, and University of Crete;FORTH-ICS, and University of Crete
Venue:
Proceedings of the 2011 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness
Year:
2011


Abstract

The currently dominant programming models for writing multicore software use threads that run over shared memory. However, as the core count increases, cache coherency protocols become very complex and ineffective, and maintaining a shared memory abstraction becomes expensive and impractical. Moreover, writing multithreaded programs is notoriously difficult, because the programmer must reason about all possible thread interleavings, including the many implicit, non-obvious, and often unpredictable interactions between threads through shared memory. Overall, as processors gain more cores and parallel software becomes mainstream, the shared memory model reaches its limits in both ease of programming and efficiency. This position paper presents two ideas that aim to solve the problem. First, we restrict the way the programmer expresses parallelism: the program is a collection of possibly recursive tasks, where each task is atomic and cannot communicate with any other task during its execution. Second, we relax the requirement for coherent shared memory: each task declares its memory footprint and is guaranteed exclusive access to that memory during its execution. Using this model, we can define a runtime system that transparently performs the data transfers required among cores without cache coherency, and that produces a deterministic execution of the program, provably equivalent to its sequential elision.
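The following Python sketch illustrates the two ideas in the abstract under simplifying assumptions; it is not the authors' runtime or API. Every name in it (`Task`, `schedule_waves`, `run_parallel`, the region names `a`, `b`, `c`) is hypothetical. Tasks are flat rather than recursive, memory is a dictionary of named regions, and every footprint entry is treated as an exclusive access. Each task receives a private copy of its footprint and the result is copied back when it finishes, which loosely stands in for explicit data transfers between cores without cache coherency. Tasks whose footprints overlap keep their program order, so the parallel run is expected to yield the same final memory as running the tasks one after another.

```python
from concurrent.futures import ThreadPoolExecutor


class Task:
    """A unit of work plus the memory regions it declares (hypothetical API)."""

    def __init__(self, fn, footprint):
        self.fn = fn
        self.footprint = frozenset(footprint)


def run_sequential(tasks, memory):
    """Sequential elision: run every task in program order on one memory."""
    for task in tasks:
        task.fn(memory)


def schedule_waves(tasks):
    """Group tasks into waves with pairwise-disjoint footprints.

    A task goes into the wave right after the latest earlier task that
    touched any of its regions, so conflicting tasks keep program order.
    """
    waves = []
    last_wave = {}  # region name -> index of the last wave that touched it
    for task in tasks:
        wave = 1 + max((last_wave.get(r, -1) for r in task.footprint), default=-1)
        if wave == len(waves):
            waves.append([])
        waves[wave].append(task)
        for region in task.footprint:
            last_wave[region] = wave
    return waves


def _run_isolated(task, memory):
    # Copy in: the task sees only the regions in its declared footprint.
    local = {region: memory[region] for region in task.footprint}
    task.fn(local)
    return local


def run_parallel(tasks, memory):
    """Run each wave concurrently, copying footprints in and out."""
    with ThreadPoolExecutor() as pool:
        for wave in schedule_waves(tasks):
            results = list(pool.map(lambda t: _run_isolated(t, memory), wave))
            for task, local in zip(wave, results):
                # Copy out: write back only the declared regions.
                memory.update({region: local[region] for region in task.footprint})


def make_tasks():
    """A small example program with four tasks over regions a, b, c."""
    def inc_a(m):
        m["a"] += 1

    def scale_b(m):
        m["b"] *= 10

    def sum_into_c(m):
        m["c"] = m["a"] + m["b"]

    def mul_a_by_c(m):
        m["a"] *= m["c"]

    return [
        Task(inc_a, {"a"}),
        Task(scale_b, {"b"}),
        Task(sum_into_c, {"a", "b", "c"}),
        Task(mul_a_by_c, {"a", "c"}),
    ]
```

In this example the first two tasks have disjoint footprints and may run concurrently, while the last two each wait for the tasks whose regions they share. Calling `run_sequential` and `run_parallel` on two copies of the same initial memory should leave both copies in the same final state.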