Compiler optimization techniques for OpenMP programs

Authors:
Shigehisa Satoh;Kazuhiro Kusano;Mitsuhisa Sato
Affiliations:
Tsukuba Res. Ctr., Real World Comp. Partnership, Japan. sh-sato@trc.rwcp.or.jp. Sys. Dev. Lab., Hitachi, Ltd., Japan (3-16-8-402 Fujimi-Cho, Chofu-shi, Tokyo 182-0033, Japan. Tel.: +81 424 41 4058 ...;1st Computers Software Division, NEC Solutions, NEC Corporation, 1-10 Nissin-cho, Fuchu, Tokyo 183-8501, Japan;Center for Computational Physics, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
Venue:
Scientific Programming
Year:
2001

Abstract

We have developed compiler optimization techniques for explicitly parallel programs written with the OpenMP API. To enable optimization across threads, we designed dataflow analysis techniques that model the interactions between threads. The structured description of parallelism and the relaxed memory consistency model in OpenMP make these analyses both effective and efficient. We developed algorithms for reaching definitions analysis, memory synchronization analysis, and cross-loop data dependence analysis for parallel loops. Our primary target is compiler-directed software distributed shared memory systems, in which aggressive compiler optimizations for software-implemented coherence schemes are crucial to obtaining good performance. We also developed optimizations applicable to general OpenMP implementations, namely redundant barrier removal and privatization of dynamically allocated objects. Experimental results for the coherence optimization show that aggressive compiler optimizations are quite effective for a shared-write-intensive program, because the coherence-induced communication volume in such a program is much larger than that in shared-read-intensive programs.
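
To make the phrase "redundant barrier removal" concrete, here is a minimal source-level sketch in C. It is an illustration only, not the authors' algorithm or the output of their compiler. The function name `scale_and_offset`, the array names, and the size `N` are hypothetical. The two parallel loops write disjoint arrays and neither reads what the other writes, so the implicit barrier after the first loop protects nothing; the standard OpenMP `nowait` clause expresses by hand what such an optimization might derive automatically.

```c
#include <assert.h>

#define N 1000  /* hypothetical problem size */

/* Hypothetical kernel: fills a[] and b[] from src[].
 * Compiles with or without OpenMP enabled; without it the
 * pragmas are ignored and the loops run sequentially. */
void scale_and_offset(double *a, double *b, const double *src)
{
    assert(a && b && src);

    #pragma omp parallel
    {
        /* Loop 1 writes only a[]. Loop 2 never touches a[], so no
         * thread has to wait for another thread's writes here.
         * nowait drops the implicit barrier at the end of this loop. */
        #pragma omp for nowait
        for (int i = 0; i < N; i++)
            a[i] = 2.0 * src[i];

        /* Loop 2 reads src[] and writes b[] only. */
        #pragma omp for
        for (int i = 0; i < N; i++)
            b[i] = src[i] + 1.0;
    }
    /* The implicit barrier at the end of the parallel region is kept,
     * so both arrays are complete when the function returns. */
}
```

If the second loop read elements of `a[]` written by other threads, the barrier would be required; deciding this is what a cross-loop data dependence analysis is for.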