A data locality optimizing algorithm
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Global optimizations for parallelism and locality on scalable parallel machines
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Data and computation transformations for multiprocessors
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Maximizing parallelism and minimizing synchronization with affine transforms
Proceedings of the 24th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
APL '98 Proceedings of the APL98 conference on Array processing language
High-level Language Support for User-defined Reductions
The Journal of Supercomputing
Performance Evaluation of the Omni OpenMP Compiler
ISHPC '00 Proceedings of the Third International Symposium on High Performance Computing
The design and implementation of a parallel array operator for the arbitrary remapping of data
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
Titanium Language Reference Manual
Titanium Language Reference Manual
MPI: A Message-Passing Interface Standard
MPI: A Message-Passing Interface Standard
A Multi-Platform Co-Array Fortran Compiler
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
UPC: Distributed Shared-Memory Programming
UPC: Distributed Shared-Memory Programming
High-level programming language abstractions for advanced and dynamic parallel computations
High-level programming language abstractions for advanced and dynamic parallel computations
Optimizing Compiler for the CELL Processor
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
X10: an object-oriented approach to non-uniform cluster computing
OOPSLA '05 Proceedings of the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications
Compilation for explicitly managed memory hierarchies
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of parallel programming
OpenMP extensions for generic libraries
IWOMP'08 Proceedings of the 4th international conference on OpenMP in a new era of parallelism
Euro-Par '09 Proceedings of the 15th International Euro-Par Conference on Parallel Processing
An approach for semiautomatic locality optimizations using OpenMP
PARA'10 Proceedings of the 10th international conference on Applied Parallel and Scientific Computing - Volume 2
Hi-index | 0.00 |
Tiling is widely used by compilers and programmer to optimize scientific and engineering code for better performance. Many parallel programming languages support tile/tiling directly through first-class language constructs or library routines. However, the current OpenMP programming language is tile oblivious , although it is the de facto standard for writing parallel programs on shared memory systems. In this paper, we introduce tile aware parallelization into OpenMP. We propose tile reduction , an OpenMP tile aware parallelization technique that allows reduction to be performed on multi-dimensional arrays. The paper has three contributions: (a) it is the first paper that proposes and discusses tile aware parallelization in OpenMP. We argue that, it is not only necessary but also possible to have tile aware parallelization in OpenMP; (b) the paper introduces the methods used to implement tile reduction, including the required OpenMP API extension and the associated code generation techniques; (c) we have applied tile reduction on a set of benchmarks. The experimental results show that tile reduction can make parallelization more natural and flexible. It not only can expose more parallelism in a program, but also can improve its data locality.