Direct parallelization of call statements
SIGPLAN '86 Proceedings of the 1986 SIGPLAN symposium on Compiler construction
A technique for summarizing data access and its use in parallelism enhancing transformations
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
SMARTS: exploiting temporal locality and parallelism through vertical execution
ICS '99 Proceedings of the 13th international conference on Supercomputing
MPI versus MPI+OpenMP on IBM SP for the NAS benchmarks
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
Is data distribution necessary in OpenMP?
Proceedings of the 2000 ACM/IEEE conference on Supercomputing
An Implementation of Interprocedural Bounded Regular Section Analysis
IEEE Transactions on Parallel and Distributed Systems
User-Level Dynamic Page Migration for Multiprogrammed Shared-Memory Multiprocessors
ICPP '00 Proceedings of the Proceedings of the 2000 International Conference on Parallel Processing
International Journal of Wireless and Mobile Computing
Towards optimisation of openMP codes for synchronisation and data reuse
International Journal of High Performance Computing and Networking
A runtime implementation of OpenMP tasks
IWOMP'11 Proceedings of the 7th international conference on OpenMP in the Petascale era
Dragon: a static and dynamic tool for OpenMP
WOMPAT'04 Proceedings of the 5th international conference on OpenMP Applications and Tools: shared Memory Parallel Programming with OpenMP
Hi-index | 0.00 |
In this paper, we show the potential benefits of translating OpenMP code to low-level parallel code using a data flow execution model, instead of targeting it directly to a multi-threaded program. Our goal is to improve data locality as well as reduce synchronization overheads without introducing data distribution directives to OpenMP. We outline an API that enables us to realize this model using SMARTS (Shared Memory Asynchronous Run-Time System), describe the work of the compiler and discuss the benefits of translating OpenMP to parallel code using data flow execution model. We show experimental results based part of the Parallel Ocean Program (POP) code and Jacobi kernel code running on an SGI Origin 2000.