Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because these loops have access patterns that are too complex or insufficiently defined at compile time. Because such loops arise frequently in practice, we have introduced a novel framework for their identification: speculative parallelization. While we have previously shown that this method is inherently scalable, its practical success depends on the fraction of ideal speedup that can be obtained on modest to moderately large parallel machines. Maximum parallelism can be obtained only by minimizing the run-time overhead of the method, which in turn depends on its level of integration within a classic restructuring compiler and on its adaptation to the characteristics of the parallelized application. We present several compiler and run-time techniques designed specifically to optimize the run-time parallelization of sparse applications. We show how to minimize the run-time overhead of speculatively parallelizing sparse applications by using static control-flow information to reduce the number of memory references that must be collected at run time. We then present heuristics for speculating on the types of data structures used by the program, thereby reducing the memory required for tracing sparse access patterns. Finally, we describe an implementation in the Polaris infrastructure and report experimental results.
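The speculative scheme described above can be illustrated with a minimal sketch of an LRPD-style run-time dependence test: each iteration's reads and writes to a shadowed array are recorded during speculative execution, and the markings are then analyzed to decide whether the loop was in fact fully parallel (if not, it must be re-executed sequentially). This is a simplified, hypothetical illustration — the function name and the access-trace encoding are assumptions, not the actual Polaris implementation, and real implementations compress the shadow structures for sparse access patterns as the abstract discusses.

```python
def lrpd_test(iterations, n):
    """Simplified LRPD-style dependence test over a shadowed array of size n.

    `iterations` is a list of per-iteration access traces, each a list of
    ('r', index) or ('w', index) tuples in program order.  Returns True if
    the speculative parallel execution exposed no cross-iteration
    dependence, False if the loop must be re-executed sequentially.
    """
    write_count = [0] * n         # how many iterations wrote each element
    exposed_read = [False] * n    # element read before any write in some iteration

    for accesses in iterations:
        written_here = set()      # elements already written in this iteration
        for op, idx in accesses:
            if op == 'w':
                write_count[idx] += 1
                written_here.add(idx)
            else:  # 'r': an exposed read sees a value from another iteration
                if idx not in written_here:
                    exposed_read[idx] = True

    # The loop is fully parallel if no element is written by more than one
    # iteration (no output dependence) and no element is both written and
    # read in an exposed fashion (no flow/anti dependence).
    no_output_dep = all(c <= 1 for c in write_count)
    no_flow_dep = all(not (write_count[i] >= 1 and exposed_read[i])
                      for i in range(n))
    return no_output_dep and no_flow_dep
```

For example, two iterations writing disjoint elements pass the test, while an iteration reading an element that another iteration wrote fails it; a read that follows a write to the same element *within* one iteration is not exposed and does not, by itself, invalidate the speculation.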