Several compile-time loop transformations have been developed to expose parallelism in loops with simple dependencies. However, once an irregular data dependence is detected, usually no attempt is made to extract any parallel threads from the loop. In this paper, we present parallel region execution, a new compile-time approach for improving the execution of loops with complex dependencies. It divides the iteration space of the loop into parallel regions and serial regions, where all iterations in a parallel region can be executed fully in parallel. Our parallel region execution technique has been tested on the MasPar machine for various examples and generally resulted in a large speedup.
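
To make the idea of parallel and serial regions concrete, the following is a minimal sketch in Python. It is not the paper's compile-time analysis: it finds dependences by brute-force enumeration at run time, and the example loop `a[2*i] = a[i] + 1`, the helper names `partition_iterations`, `body`, and `run`, and the problem size are all assumptions chosen for illustration. Iterations that share no array element with any other iteration form the parallel region; the rest form the serial region and keep their original order.

```python
def partition_iterations(n, write_idx, read_idx):
    """Split iterations 0..n-1 into (parallel, serial) lists.

    Brute-force illustration: an iteration is 'serial' if some array
    element it touches is also touched by a different iteration and at
    least one of the two accesses is a write.
    """
    writers, readers = {}, {}
    for i in range(n):
        writers.setdefault(write_idx(i), set()).add(i)
        readers.setdefault(read_idx(i), set()).add(i)

    dependent = set()
    for elem, ws in writers.items():
        touching = ws | readers.get(elem, set())
        if len(touching) > 1:  # two distinct iterations, one of them writes
            dependent |= touching

    parallel = [i for i in range(n) if i not in dependent]
    serial = [i for i in range(n) if i in dependent]
    return parallel, serial


def body(a, i):
    # Hypothetical loop body with a non-uniform dependence distance (i/2).
    a[2 * i] = a[i] + 1


def run(initial, order):
    """Execute the loop body over a copy of `initial` in the given order."""
    a = list(initial)
    for i in order:
        body(a, i)
    return a


n = 16
initial = list(range(2 * n))
parallel, serial = partition_iterations(n, lambda i: 2 * i, lambda i: i)

# Reference result: the original sequential loop.
expected = run(initial, range(n))

# Parallel region in an arbitrary (here reversed) order, then the serial
# region in its original order; the result should match the reference.
reordered = run(initial, list(reversed(parallel)) + serial)
```

Running the parallel region in reversed order stands in for true parallel execution: if the result is unchanged under an arbitrary ordering of those iterations, they carry no dependences among themselves or with the serial region. A compile-time method would derive the regions symbolically from the subscript expressions instead of enumerating iterations.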