Compile-time analysis of parallel programs that share memory
POPL '92 Proceedings of the 19th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Data flow equations for explicitly parallel programs
PPOPP '93 Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming
Static single assignment for explicitly parallel programs
POPL '93 Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Optimizing parallel programs with explicit synchronization
PLDI '95 Proceedings of the ACM SIGPLAN 1995 conference on Programming language design and implementation
Compiler optimizations for eliminating barrier synchronization
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
A compiler-directed distributed shared memory system
ICS '95 Proceedings of the 9th international conference on Supercomputing
Iteration space slicing and its application to communication optimization
ICS '97 Proceedings of the 11th international conference on Supercomputing
Basic compiler algorithms for parallel programs
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Code motion for explicitly parallel programs
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Transparent adaptive parallelism on NOWs using OpenMP
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
High Performance Compilers for Parallel Computing
High Performance Compilers for Parallel Computing
COMPaS: A PC-Based SMP Cluster
IEEE Concurrency
Concurrent SSA Form in the Presence of Mutual Exclusion
ICPP '98 Proceedings of the 1998 International Conference on Parallel Processing
Supporting Software Distributed Shared Memory with an Optimizing Compiler
ICPP '98 Proceedings of the 1998 International Conference on Parallel Processing
IPPS '99/SPDP '99 Proceedings of the 13th International Symposium on Parallel Processing and the 10th Symposium on Parallel and Distributed Processing
Computing Communication Sets for Control Parallel Programs
LCPC '94 Proceedings of the 7th International Workshop on Languages and Compilers for Parallel Computing
Impact of OpenMP Optimizations for the MGCG Method
ISHPC '00 Proceedings of the Third International Symposium on High Performance Computing
Performance Evaluation of the Omni OpenMP Compiler
ISHPC '00 Proceedings of the Third International Symposium on High Performance Computing
Parallel Data-Flow Analysis of Explicitly Parallel Programs
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Array SSA for Explicitly Parallel Programs
Euro-Par '99 Proceedings of the 5th International Euro-Par Conference on Parallel Processing
Fine-Grain Software Distributed Shared Memory on SMP Clusters
HPCA '98 Proceedings of the 4th International Symposium on High-Performance Computer Architecture
Sirocco: Cost-Effective Fine-Grain Distributed Shared Memory
PACT '98 Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques
Compiler Support for Data Forwarding in Scalable Shared-Memory Multiprocessors
ICPP '99 Proceedings of the 1999 International Conference on Parallel Processing
Compile-time Synchronization Optimizations for Software DSMs
IPPS '98 Proceedings of the 12th. International Parallel Processing Symposium on International Parallel Processing Symposium
Detailed cache coherence characterization for OpenMP benchmarks
Proceedings of the 18th annual international conference on Supercomputing
A hybrid hardware/software approach to efficiently determine cache coherence Bottlenecks
Proceedings of the 19th annual international conference on Supercomputing
Analysis of cache-coherence bottlenecks with hybrid hardware/software techniques
ACM Transactions on Architecture and Code Optimization (TACO)
Source-Code-Correlated Cache Coherence Characterization of OpenMP Benchmarks
IEEE Transactions on Parallel and Distributed Systems
Improving the performance of OpenMP by array privatization
WOMPAT'03 Proceedings of the OpenMP applications and tools 2003 international conference on OpenMP shared memory parallel programming
Efficient OpenMP support and extensions for MPSoCs with explicitly managed memory hierarchy
Proceedings of the Conference on Design, Automation and Test in Europe
Static nonconcurrency analysis of OpenMP programs
IWOMP'05/IWOMP'06 Proceedings of the 2005 and 2006 international conference on OpenMP shared memory parallel programming
Proceedings of the 38th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
ompVerify: polyhedral analysis for the OpenMP programmer
IWOMP'11 Proceedings of the 7th international conference on OpenMP in the Petascale era
On a Technique for Transparently Empowering Classical Compiler Optimizations on Multithreaded Code
ACM Transactions on Programming Languages and Systems (TOPLAS)
Online feedback-directed optimizations for parallel Java code
Proceedings of the 2013 ACM SIGPLAN international conference on Object oriented programming systems languages & applications
Hi-index | 0.00 |
We have developed compiler optimization techniques for explicit parallel programs using the OpenMP API. To enable optimization across threads, we designed dataflow analysis techniques in which interactions between threads are effectively modeled. Structured description of parallelism and relaxed memory consistency in OpenMP make the analyses effective and efficient. We developed algorithms for reaching definitions analysis, memory synchronization analysis, and cross-loop data dependence analysis for parallel loops. Our primary target is compiler-directed software distributed shared memory systems in which aggressive compiler optimizations for software-implemented coherence schemes are crucial to obtaining good performance. We also developed optimizations applicable to general OpenMP implementations, namely redundant barrier removal and privatization of dynamically allocated objects. Experimental results for the coherency optimization show that aggressive compiler optimizations are quite effective for a shared-write intensive program because the coherence-induced communication volume in such a program is much larger than that in shared-read intensive programs.