Parallel and vector machines are becoming increasingly important to many computation-intensive applications. Effectively utilizing such architectures, particularly from sequential languages such as Fortran, has demanded increasingly sophisticated compilers. In general, a compiler needs to significantly reorder a program in order to generate code optimal for a specific architecture.

Because DO loops typically control the execution of a number of statements, the order in which loops are executed can dramatically affect the performance of a machine on a particular section of code. In particular, loop interchange can often be used to enhance the performance of code on parallel or vector machines.

Determining when loops may be safely and profitably interchanged requires a study of the data dependences in the program. This work discusses specific applications of dependence theory to loop interchange. The theory is described as it has been implemented in PFC (Parallel Fortran Converter), a program that attempts to uncover operations in sequential Fortran code that may be safely rewritten as vector operations.
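To make the safety question concrete, the following is a minimal sketch in Python, not PFC's implementation. It assumes a perfectly nested pair of loops and uses the standard textbook criterion that interchange is unsafe when some dependence has direction vector `(<, >)`. The names `interchange_is_legal` and `sweep` are hypothetical, introduced only for illustration.

```python
def interchange_is_legal(direction_vectors):
    """Textbook legality check for interchanging a two-deep loop nest.

    Interchanging the loops swaps the two entries of every direction vector.
    A ('<', '>') dependence would become ('>', '<'), meaning the sink would
    execute before its source, so its presence forbids the interchange.
    """
    return all(tuple(dv) != ("<", ">") for dv in direction_vectors)


def sweep(n, m, dj, j_outer):
    """Execute a[i][j] = a[i-1][j+dj] + 1 over an n-by-m grid of zeros.

    j_outer=False runs the i loop outermost (the original order);
    j_outer=True runs the interchanged nest with the j loop outermost.
    """
    a = [[0] * m for _ in range(n)]
    js = range(max(0, -dj), min(m, m - dj))  # keep j + dj inside the row
    if j_outer:
        order = [(i, j) for j in js for i in range(1, n)]
    else:
        order = [(i, j) for i in range(1, n) for j in js]
    for i, j in order:
        a[i][j] = a[i - 1][j + dj] + 1
    return a


if __name__ == "__main__":
    # dj = 0 gives a dependence with direction (<, =): interchange is safe.
    print(interchange_is_legal([("<", "=")]),
          sweep(4, 4, 0, False) == sweep(4, 4, 0, True))
    # dj = 1 gives a dependence with direction (<, >): interchange is unsafe.
    print(interchange_is_legal([("<", ">")]),
          sweep(4, 4, 1, False) == sweep(4, 4, 1, True))
```

With `dj = 0` each column is updated independently, so both loop orders produce the same array. With `dj = 1` the interchanged nest reads `a[i-1][j+1]` before it has been updated, which is the kind of reordering a dependence test is meant to reject. Whether a legal interchange is also profitable depends on the target machine and is not modeled here.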