Online feedback-directed optimizations for parallel Java code

Authors:
Albert Noll;Thomas Gross
Affiliations:
ETH Zurich, Zurich, Switzerland;ETH Zurich, Zurich, Switzerland
Venue:
Proceedings of the 2013 ACM SIGPLAN international conference on Object oriented programming systems languages & applications
Year:
2013

Abstract

The performance of parallel code depends significantly on the parallel task granularity (PTG). If the PTG is too coarse, performance suffers from load imbalance; if the PTG is too fine, performance suffers from the overhead of creating and scheduling parallel tasks. This paper presents a software platform that automatically determines the PTG at run-time. Automatic PTG selection is enabled by concurrent calls, special source-language constructs that defer until run-time the decision of whether a call executes sequentially or concurrently (as a parallel task). Furthermore, the execution semantics of concurrent calls permit the runtime system to merge two or more concurrent calls, thereby coarsening the PTG. We integrate concurrent calls into the Java programming language and the Java Memory Model, and show how the Java Virtual Machine can adapt the PTG based on dynamic profiling. The performance evaluation shows that our runtime system is competitive with Java programs whose PTG is tuned manually. When the PTG of standard Java code is poorly chosen, our approach runs up to 3x faster.
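To make the idea of a late sequential-or-parallel decision and of merging calls more concrete, here is a minimal library-level sketch in plain Java. It is not the paper's language construct or runtime: the class `GranularitySketch`, the method `concurrentCall`, and the `mergeFactor` knob are hypothetical names chosen for illustration. The knob is fixed by hand here, whereas the abstract describes a JVM that picks the granularity from dynamic profiling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Hypothetical sketch; none of these names come from the paper.
final class GranularitySketch {
    private final ExecutorService pool;
    // Assumed knob: 0 means "run every call inline", k > 0 means
    // "merge k calls into one parallel task".
    private final int mergeFactor;
    private final List<Runnable> pending = new ArrayList<>();
    private final List<Future<?>> inFlight = new ArrayList<>();
    int tasksSpawned = 0;

    GranularitySketch(ExecutorService pool, int mergeFactor) {
        this.pool = pool;
        this.mergeFactor = mergeFactor;
    }

    // The caller does not know whether the body runs inline or as part of a task.
    void concurrentCall(Runnable body) {
        if (mergeFactor <= 0) {
            body.run();              // late decision: execute sequentially
            return;
        }
        pending.add(body);           // late decision: defer and possibly merge
        if (pending.size() >= mergeFactor) {
            flush();
        }
    }

    // Submit all pending calls as ONE task, which coarsens the granularity.
    private void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<Runnable> batch = new ArrayList<>(pending);
        pending.clear();
        inFlight.add(pool.submit(() -> {
            for (Runnable r : batch) {
                r.run();
            }
        }));
        tasksSpawned++;
    }

    // Submit any leftover calls and wait until every task has finished.
    void awaitAll() {
        flush();
        for (Future<?> f : inFlight) {
            try {
                f.get();
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
        inFlight.clear();
    }
}
```

With `mergeFactor` set to 4, ten calls are grouped into three tasks (4, 4, and 2 calls); with `mergeFactor` set to 0, the same ten calls run inline and no task is created. The sketch assumes that all calls are issued from a single thread and that the call bodies are independent of one another.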