The fuzzy barrier: a mechanism for high speed synchronization of processors
ASPLOS III Proceedings of the third international conference on Architectural support for programming languages and operating systems
An efficient method of computing static single assignment form
POPL '89 Proceedings of the 16th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Cilk: an efficient multithreaded runtime system
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Automatic selection of high-order transformations in the IBM XL FORTRAN compilers
IBM Journal of Research and Development - Special issue: performance analysis and its impact on design
Concurrent SSA Form in the Presence of Mutual Exclusion
ICPP '98 Proceedings of the 1998 International Conference on Parallel Processing
Concurrent Static Single Assignment Form and Constant Propagation for Explicitly Parallel Programs
LCPC '97 Proceedings of the 10th International Workshop on Languages and Compilers for Parallel Computing
LCPC '97 Proceedings of the 10th International Workshop on Languages and Compilers for Parallel Computing
A Concurrent Execution Semantics for Parallel Program Graphs and Program Dependence Graphs
Proceedings of the 5th International Workshop on Languages and Compilers for Parallel Computing
Parallel Program Graphs and their Classification
Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing
X10: an object-oriented approach to non-uniform cluster computing
OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
May-happen-in-parallel analysis of X10 programs
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Productivity and performance using partitioned global address space languages
Proceedings of the 2007 international workshop on Parallel symbolic computation
Phasers: a unified deadlock-free construct for collective and point-to-point synchronization
Proceedings of the 22nd annual international conference on Supercomputing
Chunking parallel loops in the presence of synchronization
Proceedings of the 23rd international conference on Supercomputing
Work-first and help-first scheduling policies for async-finish task parallelism
IPDPS '09 Proceedings of the 2009 IEEE International Symposium on Parallel&Distributed Processing
Interprocedural Load Elimination for Dynamic Optimization of Parallel Programs
PACT '09 Proceedings of the 2009 18th International Conference on Parallel Architectures and Compilation Techniques
ICPP '09 Proceedings of the 2009 International Conference on Parallel Processing
Polyglot: an extensible compiler framework for Java
CC'03 Proceedings of the 12th international conference on Compiler construction
Reducing task creation and termination overhead in explicitly parallel programs
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Efficient data race detection for async-finish parallelism
RV'10 Proceedings of the First international conference on Runtime verification
Proceedings of the 2011 ACM international conference on Object oriented programming systems languages and applications
Communication Optimizations for Distributed-Memory X10 Programs
IPDPS '11 Proceedings of the 2011 IEEE International Parallel & Distributed Processing Symposium
Habanero-Java: the new adventures of old X10
Proceedings of the 9th International Conference on Principles and Practice of Programming in Java
Hierarchical place trees: a portable abstraction for task parallelism and data movement
LCPC'09 Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing
Permission regions for race-free parallelism
RV'11 Proceedings of the Second international conference on Runtime verification
Scalable and precise dynamic datarace detection for structured parallelism
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Efficient data race detection for async-finish parallelism
Formal Methods in System Design
A Transformation Framework for Optimizing Task-Parallel Programs
ACM Transactions on Programming Languages and Systems (TOPLAS)
Hi-index | 0.00 |
An Intermediate Language (IL) specifies a program at a level of abstraction that includes precise semantics for state updates and control flow, but leaves unspecified the low-level software and hardware mechanisms that will be used to implement the semantics. Past ILs have followed the Von Neumann execution model by making sequential execution the default, and by supporting parallelism with runtime calls for lower-level mechanisms such as threads and locks. Now that the multicore trend is making parallelism the default execution model for all software, it behooves us as a community to study the fundamental requirements in parallel execution models and explore how they can be supported by first-class abstractions at the IL level. In this paper, we introduce five key requirements for Parallel Intermediate Representations (PIR): 1) Lightweight asynchronous tasks and communications, 2) Explicit locality, 3) Directed Synchronization with Dynamic Parallelism:, 4) Mutual Exclusion and Isolation with Dynamic Parallelism, and 5) Relaxed Exception semantics for Parallelism. We summarize the approach being taken in the Habanero Multicore Software Research project at Rice University to define a Parallel Intermediate Representation (PIR) to address these requirements. We discuss the basic issues of designing and implementing PIRs within the Habanero-Java (HJ) compilation framework that spans multiple levels of PIRs. By demonstrating several program optimizations developed in the HJ compilation framework, we show that this new PIR-based approach to compiler development brings robustness to the process of analyzing and optimizing parallel programs and is applicable to a wide range of task-parallelism programming models available today.