Loop Parallelization
Loop Transformations for Restructuring Compilers: The Foundations
High Performance Compilers for Parallel Computing
Enabling unimodular transformations
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Partitioning and Labeling of Loops by Unimodular Transformations
IEEE Transactions on Parallel and Distributed Systems
Hyperplane Partitioning: An Approach to Global Data Partitioning for Distributed Memory Machines
IPPS '99/SPDP '99 Proceedings of the 13th International Symposium on Parallel Processing and the 10th Symposium on Parallel and Distributed Processing
Communication Cost Estimation and Global Data Partitioning for Distributed Memory Machines
HIPC '97 Proceedings of the Fourth International Conference on High-Performance Computing
Today, the challenge is for software to exploit the parallelism offered by multi-core architectures. This can be done by rewriting the application to take advantage of the hardware's capabilities, or by expecting the compiler and software runtime tools to do the job for us. With the advent of multi-core architectures ([1] [2]), this problem is becoming increasingly relevant. Even today, there are few run-time tools that can analyze the behavioral patterns of such performance-critical applications and recompile them accordingly. Consequently, techniques such as OpenMP for shared-memory programs remain useful for exploiting the parallelism available in the machine. This work studies whether loop parallelization (both with and without applying loop transformations) is an effective way to run scientific programs efficiently on such multi-core architectures. We have found the results to be encouraging, and we believe this approach could yield good results if implemented fully in a production compiler for multi-core architectures.