This paper develops a new approach to compiling C programs for multi-processor DSPs with multiple address spaces. It integrates into a parallelization strategy a novel data transformation technique that exposes the processor location of partitioned data. Combined with a new address resolution mechanism, the approach generates efficient programs that run across multiple address spaces without message passing. Applied to the UTDSP benchmark suite and evaluated on a four-processor TigerSHARC board, it outperforms existing approaches and achieves an average speedup of 3.25 on the parallel benchmarks.
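To make the idea of exposing the processor location of partitioned data concrete, the following is a minimal sketch and not the paper's actual transformation or address resolution mechanism. It assumes a simple block partition of a one-dimensional array, and every name in it (`local_mem`, `base_of`, `owner_of`, `offset_of`, `resolve`, `init_block`, `read_global`) is hypothetical. The separate address spaces are simulated with ordinary arrays in one process; on real hardware each block would sit in a different processor's memory.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PROCS 4            /* assumed processor count, matching a four-processor board */
#define N 16                   /* hypothetical global array length, divisible by NUM_PROCS */
#define BLOCK (N / NUM_PROCS)  /* elements owned by each processor */

/* Stand-in for each processor's private memory. The first index is the
   processor id, so the owner of every element is visible in the code. */
static int local_mem[NUM_PROCS][BLOCK];

/* Hypothetical resolution table: base address of each processor's block. */
static int *const base_of[NUM_PROCS] = {
    local_mem[0], local_mem[1], local_mem[2], local_mem[3]
};

/* Owner of global index i under a block partition. */
static int owner_of(size_t i) {
    assert(i < N);
    return (int)(i / BLOCK);
}

/* Offset of global index i inside its owner's block. */
static size_t offset_of(size_t i) {
    assert(i < N);
    return i % BLOCK;
}

/* Resolve a global index to an address: owner's base plus local offset. */
static int *resolve(size_t i) {
    return base_of[owner_of(i)] + offset_of(i);
}

/* Processor p initialises only the block it owns: a[i] = 2 * i. */
static void init_block(int p) {
    for (size_t j = 0; j < BLOCK; j++)
        local_mem[p][j] = 2 * (int)(p * BLOCK + j);
}

/* A read of any element, local or remote, is a plain load through
   resolve(); no send/receive pair is involved in this sketch. */
static int read_global(size_t i) {
    return *resolve(i);
}
```

In this sketch the original index `i` is split into a processor id and a local offset, so a parallelizing pass could assign iterations to the processor that owns the data they touch, while accesses to another processor's block go through the base-address table rather than through explicit messages. Whether the paper's mechanism uses a table of this kind is an assumption made here for illustration only.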