Compilers: principles, techniques, and tools
Compilers: principles, techniques, and tools
Advanced compiler optimizations for supercomputers
Communications of the ACM - Special issue on parallelism
Interprocedural dependence analysis and parallelization
SIGPLAN '86 Proceedings of the 1986 SIGPLAN symposium on Compiler construction
Direct parallelization of call statements
SIGPLAN '86 Proceedings of the 1986 SIGPLAN symposium on Compiler construction
Theory of linear and integer programming
Theory of linear and integer programming
An incremental algorithm for software analysis
SDE 2 Proceedings of the second ACM SIGSOFT/SIGPLAN software engineering symposium on Practical software development environments
Data dependence and its application to parallel processing
International Journal of Parallel Programming
ICS '88 Proceedings of the 2nd international conference on Supercomputing
Efficient interprocedural analysis for program parallelization and restructuring
PPEALS '88 Proceedings of the ACM/SIGPLAN conference on Parallel programming: experience with applications, languages and systems
A technique for summarizing data access and its use in parallelism enhancing transformations
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Structured dataflow analysis for arrays and its use in an optimizing complier
Software—Practice & Experience
Automatic recognition of induction variables and recurrence relations by abstract interpretation
PLDI '90 Proceedings of the ACM SIGPLAN 1990 conference on Programming language design and implementation
Properties of data flow frameworks: a unified model
Acta Informatica
Semantical interprocedural parallelization: an overview of the PIPS project
ICS '91 Proceedings of the 5th international conference on Supercomputing
An experiment with inline substitution
Software—Practice & Experience
Efficient and exact data dependence analysis
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Control-flow analysis of higher-order languages of taming lambda
Control-flow analysis of higher-order languages of taming lambda
Detecting redundant accesses to array data
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Factoring: a method for scheduling parallel loops
Communications of the ACM
Sharlit—a tool for building optimizers
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Eliminating false data dependences using the Omega test
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Delinearization: an efficient way to break multiloop dependence equations
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
A safe approximate algorithm for interprocedural aliasing
PLDI '92 Proceedings of the ACM SIGPLAN 1992 conference on Programming language design and implementation
Design and evaluation of a compiler algorithm for prefetching
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
ACM Letters on Programming Languages and Systems (LOPLAS)
Array-data flow analysis and its use in array privatization
POPL '93 Proceedings of the 20th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Interprocedural analyses for programming environments
Environments and tools for parallel scientific computing
Improving locality and parallelism in nested loops
Improving locality and parallelism in nested loops
An empirical study of precise interprocedural array analysis
Scientific Programming
Compiler optimizations for eliminating barrier synchronization
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Data and computation transformations for multiprocessors
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
Detecting coarse-grain parallelism using an interprocedural parallelizing compiler
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
Symbolic analysis for parallelizing compilers
ACM Transactions on Programming Languages and Systems (TOPLAS)
Compiler-directed page coloring for multiprocessors
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
Automatic array privatization and demand-driven symbolic analysis
Automatic array privatization and demand-driven symbolic analysis
Supercomputer performance evaluation and the Perfect Benchmarks
ICS '90 Proceedings of the 4th international conference on Supercomputing
Parallelizing compiler techniques based on linear inequalities
Parallelizing compiler techniques based on linear inequalities
Global Data Flow Analysis and Iterative Algorithms
Journal of the ACM (JACM)
A Fast and Usually Linear Algorithm for Global Flow Analysis
Journal of the ACM (JACM)
A Unified Approach to Path Problems
Journal of the ACM (JACM)
Fast Algorithms for Solving Path Problems
Journal of the ACM (JACM)
A program data flow analysis procedure
Communications of the ACM
Efficient computation of flow insensitive interprocedural summary information
SIGPLAN '84 Proceedings of the 1984 SIGPLAN symposium on Compiler construction
Dependence Analysis for Supercomputing
Dependence Analysis for Supercomputing
A precise inter-procedural data flow algorithm
POPL '81 Proceedings of the 8th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
An efficient way to find the side effects of procedure calls and the aliases of variables
POPL '79 Proceedings of the 6th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
The range test: a dependence test for symbolic, non-linear expressions
Proceedings of the 1994 ACM/IEEE conference on Supercomputing
Requirements for Data-Parallel Programming Environments
IEEE Parallel & Distributed Technology: Systems & Technology
Parallel Programming with Polaris
Computer
Multiprocessors from a Software Perspective
IEEE Micro
An Implementation of Interprocedural Bounded Regular Section Analysis
IEEE Transactions on Parallel and Distributed Systems
Performance Analysis of Parallelizing Compilers on the Perfect Benchmarks Programs
IEEE Transactions on Parallel and Distributed Systems
FIAT: A Framework for Interprocedural Analysis and Transfomation
Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing
Proceedings of the 6th International Workshop on Languages and Compilers for Parallel Computing
Interprocedural Array Region Analyses
LCPC '95 Proceedings of the 8th International Workshop on Languages and Compilers for Parallel Computing
Interprocedural Analysis for Parallelization
LCPC '95 Proceedings of the 8th International Workshop on Languages and Compilers for Parallel Computing
Predicting the effects of optimization on a procedure body
SIGPLAN '79 Proceedings of the 1979 SIGPLAN symposium on Compiler construction
On program restructuring, scheduling, and communication for parallel processor systems
On program restructuring, scheduling, and communication for parallel processor systems
Interprocedural symbolic analysis
Interprocedural symbolic analysis
Symbolic analysis techniques for effective automatic parallelization
Symbolic analysis techniques for effective automatic parallelization
Evaluation of predicated array data-flow analysis for automatic parallelization
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Empirical optimization for a sparse linear solver: a case study
International Journal of Parallel Programming - Special issue: The next generation software program
Combining compile-time and run-time parallelization[1]
Scientific Programming
Software behavior oriented parallelization
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
Sensitivity analysis for automatic parallelization on multi-cores
Proceedings of the 21st annual international conference on Supercomputing
Automatic Discovery of Coarse-Grained Parallelism in Media Applications
Transactions on High-Performance Embedded Architectures and Compilers I
Synchronization via scheduling: managing shared state in video games
HotPar'10 Proceedings of the 2nd USENIX conference on Hot topics in parallelism
Transparent runtime parallelization of the R scripting language
Journal of Parallel and Distributed Computing
A code isolator: isolating code fragments from large programs
LCPC'04 Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
Logical inference techniques for loop parallelization
Proceedings of the 33rd ACM SIGPLAN conference on Programming Language Design and Implementation
Financial software on GPUs: between Haskell and Fortran
Proceedings of the 1st ACM SIGPLAN workshop on Functional high-performance computing
Loop acceleration exploration for ASIP architecture
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
A T2 graph-reduction approach to fusion
Proceedings of the 2nd ACM SIGPLAN workshop on Functional high-performance computing
Non-affine Extensions to Polyhedral Code Generation
Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization
Hi-index | 0.01 |
As shared-memory multiprocessor systems become widely available, there is an increasing need for tools to simplify the task of developing parallel programs. This paper describes one such tool, the automatic parallelization system in the Stanford SUIF compiler. This article represents a culmination of a several-year research effort aimed at making parallelizing compilers significantly more effective. We have developed a system that performs full interprocedural parallelization analyses, including array privatization analysis, array reduction recognition, and a suite of scalar data-flow analyses including symbolic analysis. These analyses collaborate in an integrated fashion to exploit coarse-grain parallel loops, computationally intensive loops that can execute on multiple processors independently with no cross-processor synchronization or communication. The system has successfully parallelized large interprocedural loops over a thousand lines of code completely automatically from sequential applications.This article provides a comprehensive description of the analyses in the SUIF system. We also present extensive empirical results on four benchmark suites, showing the contribution of individual analysis techniques both in executing more of the computation in parallel, and in increasing the granularity of the parallel computations. These results demonstrate the importance of interprocedural array data-flow analysis, array privatization and array reduction recognition; a third of the programs spend more than 50&percent; of their execution time in computations that are parallelized with these techniques. Overall, these results indicate that automatic parallelization can be effective on sequential scientific computations, but only if the compiler incorporates all of these analyses.