Automatic translation of FORTRAN programs to vector form

Authors: Randy Allen; Ken Kennedy
Affiliations: Rice Univ., Houston, TX; Rice Univ., Houston, TX
Venue: ACM Transactions on Programming Languages and Systems (TOPLAS)
Year: 1987

Citing 11
Cited 210

Abstract

The recent success of vector computers such as the Cray-1 and array processors such as those manufactured by Floating Point Systems has increased interest in making vector operations available to the FORTRAN programmer. The FORTRAN standards committee is currently considering a successor to FORTRAN 77, usually called FORTRAN 8x, that will permit the programmer to explicitly specify vector and array operations.

Although FORTRAN 8x will make it convenient to specify explicit vector operations in new programs, it does little for existing code. In order to benefit from the power of vector hardware, existing programs will need to be rewritten in some language (presumably FORTRAN 8x) that permits the explicit specification of vector operations. One way to avoid a massive manual recoding effort is to provide a translator that discovers the parallelism implicit in a FORTRAN program and automatically rewrites that program in FORTRAN 8x.

Such a translation from FORTRAN to FORTRAN 8x is not straightforward because FORTRAN DO loops are not always semantically equivalent to the corresponding FORTRAN 8x parallel operation. The semantic difference between these two constructs is precisely captured by the concept of dependence. A translation from FORTRAN to FORTRAN 8x preserves the semantics of the original program if it preserves the dependences in that program.

The theoretical background is developed here for employing data dependence to convert FORTRAN programs to parallel form. Dependence is defined and characterized in terms of the conditions that give rise to it; accurate tests to determine dependence are presented; and transformations that use dependence to uncover additional parallelism are discussed.
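
The gap between a DO loop and the corresponding vector operation can be illustrated with a small sketch. It is written in Python rather than FORTRAN 8x, and it is not taken from the paper: the function names `sequential_do` and `vector_assign` are hypothetical, and the vector model assumed here is that every right-hand-side operand is fetched before any result is stored. Under that assumption, a loop such as `A(I+1) = A(I) + B(I)`, where each iteration reads a value written by the previous one, gives a different answer when run as a single vector operation, while `A(I) = A(I+1) + B(I)` gives the same answer either way.

```python
def sequential_do(a, b, read_offset, write_offset):
    """Model of a DO loop: A(I+write_offset) = A(I+read_offset) + B(I),
    executed one iteration at a time, in order."""
    a = list(a)  # work on a copy so the caller's data is untouched
    for i in range(len(b)):
        a[i + write_offset] = a[i + read_offset] + b[i]
    return a


def vector_assign(a, b, read_offset, write_offset):
    """Assumed vector semantics: fetch all right-hand-side operands first,
    then store all results."""
    a = list(a)
    rhs = [a[i + read_offset] + b[i] for i in range(len(b))]  # fetch phase
    for i, value in enumerate(rhs):                           # store phase
        a[i + write_offset] = value
    return a


A = [1, 2, 3, 4, 5]
B = [10, 10, 10, 10]

# A(I+1) = A(I) + B(I): each iteration reads what the previous one wrote.
carried_seq = sequential_do(A, B, read_offset=0, write_offset=1)
carried_vec = vector_assign(A, B, read_offset=0, write_offset=1)

# A(I) = A(I+1) + B(I): each iteration reads a value not yet overwritten.
safe_seq = sequential_do(A, B, read_offset=1, write_offset=0)
safe_vec = vector_assign(A, B, read_offset=1, write_offset=0)

print(carried_seq == carried_vec)  # False: direct vectorization changes the result
print(safe_seq == safe_vec)        # True: the vector form preserves the result
```

In the first loop a value stored in one iteration is used in the next, so rewriting it as a single vector operation would not preserve that dependence. In the second loop every value is read before it is overwritten, which is also the order the vector form uses, so the two forms agree.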