IEEE Transactions on Parallel and Distributed Systems
When array references contain induction variables, induction variable substitution turns those references into nonlinear expressions. The goal of data alignment is to map computations and data onto a set of virtual processors organized as a multidimensional Cartesian grid (a template, in HPF terms) so that data locality is preserved and the communication cost of data accesses is minimized. Most existing data alignment methods target arrays referenced through linear or quadratic subscripts in n loop index variables [Chang, 2004]. In this paper, we propose a new communication-free data alignment technique for arrays referenced through exponential subscripts in n loop index variables, or through other complex nonlinear expressions. Experimental results on the SPEC95FP benchmarks indicate that the technique can improve the execution time of subroutines in those benchmarks.
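To make the setting concrete, the following minimal sketch shows how an induction variable that is multiplied on every iteration yields an exponential subscript after substitution, and how such a reference could be placed without communication. It is an illustration only, not the algorithm proposed in the paper: the loop shape, the function names, and the parameters `k0` and `factor` are all assumptions introduced here.

```python
# Illustrative loop (hypothetical, not taken from the paper):
#     k = k0
#     for i in range(n):
#         A[k] = B[k] + 1
#         k = k * factor
# After induction variable substitution, k is replaced by k0 * factor**i,
# so the references become A[k0 * factor**i] and B[k0 * factor**i].

def subscripts_with_induction(n, k0=1, factor=2):
    """Subscripts visited when k is updated as an induction variable."""
    k = k0
    visited = []
    for _ in range(n):
        visited.append(k)
        k *= factor
    return visited


def subscripts_closed_form(n, k0=1, factor=2):
    """Same subscripts, written as the exponential closed form in i."""
    return [k0 * factor**i for i in range(n)]


def owner_of_element(index, k0=1, factor=2):
    """Virtual processor i such that k0 * factor**i == index.

    Returns None when the loop never references the element.
    Assumes k0 >= 1 and factor >= 2.
    """
    if index < k0 or index % k0 != 0:
        return None
    quotient = index // k0
    i = 0
    while quotient > 1 and quotient % factor == 0:
        quotient //= factor
        i += 1
    return i if quotient == 1 else None


def is_communication_free(n, k0=1, factor=2):
    """Check that iteration i finds its A and B elements on processor i.

    A and B use the same subscript, so one ownership function places both.
    """
    return all(
        owner_of_element(k, k0, factor) == i
        for i, k in enumerate(subscripts_closed_form(n, k0, factor))
    )


if __name__ == "__main__":
    print(subscripts_with_induction(5))
    print(subscripts_closed_form(5))
    print(is_communication_free(10))
```

In this sketch iteration `i` runs on virtual processor `i`, and every element that iteration touches is assigned to the same processor, so no data needs to move between processors. Elements the loop never references (those for which `owner_of_element` returns `None`) could be placed anywhere.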