We present Hybrid Analysis (HA), a novel technology that efficiently and seamlessly integrates static and run-time analysis of memory references into a single framework, one capable of performing data dependence analysis and generating the information needed for most associated memory-related optimizations. We use HA to perform automatic parallelization by extracting run-time assertions from any loop and generating appropriate run-time tests, ranging from a low-cost scalar comparison to a full, reference-by-reference run-time analysis. Moreover, we can order the run-time tests in increasing order of complexity (overhead) and thus incur only the minimum overhead necessary to prove a loop parallel. We accomplish this both by extending compile-time interprocedural analysis techniques and by incorporating speculative run-time techniques when necessary. Our solution bridges 'free' compile-time techniques and exhaustive run-time techniques through a continuum of simple-to-complex tests. We have implemented our framework in the Polaris compiler by introducing an intermediate representation called RT_LMAD and a run-time library that operates on it. Based on the experimental results obtained to date, we expect to automatically parallelize most, and possibly all, of the PERFECT codes, a significant accomplishment.
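To make the test cascade concrete, the following C sketch illustrates the idea under stated assumptions; it is our own minimal illustration, not the Polaris/RT_LMAD implementation (which targets Fortran and uses a richer descriptor-based test set). The loop, the function names (ranges_disjoint, refs_independent, copy_gather), and the OpenMP dispatch are all hypothetical. The guarded loop writes a[i] and reads b[idx[i]], so it is safe to parallelize when no written element aliases a read element.

```c
#include <stdbool.h>
#include <stddef.h>

/* Cheapest test (O(1)): assuming every idx[j] < n, reads fall in
 * [b, b+n) and writes in [a, a+n); disjoint ranges imply independence.
 * This is the "low-cost scalar comparison" end of the cascade. */
static bool ranges_disjoint(const double *a, const double *b, size_t n) {
    return (a + n <= b) || (b + n <= a);
}

/* Most expensive fallback (O(n^2) here, for clarity): a stand-in for a
 * full reference-by-reference run-time test. Independence fails only if
 * some written address a+i coincides with some read address b+idx[j]. */
static bool refs_independent(const double *a, const double *b,
                             const size_t *idx, size_t n) {
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            if (a + i == b + idx[j])
                return false;
    return true;
}

/* Tests run in increasing order of overhead; the first success
 * dispatches to the parallel version, otherwise we fall back to the
 * original sequential loop. */
void copy_gather(double *a, const double *b, const size_t *idx, size_t n) {
    if (ranges_disjoint(a, b, n) || refs_independent(a, b, idx, n)) {
        #pragma omp parallel for
        for (long i = 0; i < (long)n; i++)
            a[i] = b[idx[i]] + 1.0;
    } else {
        for (size_t i = 0; i < n; i++)
            a[i] = b[idx[i]] + 1.0;
    }
}
```

The design point the sketch captures is the ordering: the O(1) range comparison is tried first, so the expensive reference-by-reference check is paid only when the cheap test is inconclusive, and the sequential version is always available as a correct fallback.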