Conversion of control dependence to data dependence

Authors:
J. R. Allen;Ken Kennedy;Carrie Porterfield;Joe Warren
Affiliations:
Rice University, Houston, Texas;Rice University, Houston, Texas;Rice University, Houston, Texas;Rice University, Houston, Texas
Venue:
POPL '83 Proceedings of the 10th ACM SIGACT-SIGPLAN symposium on Principles of programming languages
Year:
1983

Abstract

Program analysis methods, especially those which support automatic vectorization, are based on the concept of interstatement dependence, where a dependence holds between two statements when one of the statements computes values needed by the other. Powerful program transformation systems that convert sequential programs to a form more suitable for vector or parallel machines have been developed using this concept [AllK 82, KKLW 80].

The dependence analysis in these systems is based on data dependence. In the presence of complex control flow, data dependence is not sufficient to transform programs because of the introduction of control dependences. A control dependence exists between two statements when the execution of one statement can prevent the execution of the other. Control dependences do not fit conveniently into dependence-based program translators.

One solution is to convert all control dependences to data dependences by eliminating goto statements and introducing logical variables to control the execution of statements in the program. In this scheme, action statements are converted to IF statements. The variables in the conditional expression of an IF statement can be viewed as inputs to the statement being controlled. The result is that control dependences between statements become explicit data dependences expressed through the definitions and uses of the controlling logical variables.

This paper presents a method for systematically converting control dependences to data dependences in this fashion. The algorithms presented here have been implemented in PFC, an experimental vectorizer written at Rice University.
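The idea of replacing a branch with a controlling logical variable can be illustrated with a small sketch. This is not the paper's algorithm or PFC code; it is a hand-written Python example, and the function names, the loop body, and the choice to expand the guard into an array (so the loop can be split into single-statement loops) are assumptions made for illustration only.

```python
def with_branch(a, b, c):
    """Original form: a forward branch skips two action statements."""
    b, c = list(b), list(c)
    for i in range(len(a)):
        if a[i] <= 0:
            continue              # stands in for a GOTO to the end of the loop body
        b[i] = b[i] / a[i]        # control dependent on the branch
        c[i] = c[i] + b[i]        # control dependent on the branch
    return b, c


def if_converted(a, b, c):
    """Hypothetical if-converted form: the branch becomes a logical variable.

    Each action statement is rewritten as a guarded (IF) statement whose
    guard is an ordinary input, so the former control dependence is now a
    data dependence on `guard`.
    """
    b, c = list(b), list(c)
    n = len(a)
    # Definition of the controlling logical variable (assumed here to be
    # expanded into an array, one element per iteration).
    guard = [a[i] > 0 for i in range(n)]
    # With only data dependences left, each statement can sit in its own loop.
    for i in range(n):
        if guard[i]:
            b[i] = b[i] / a[i]
    for i in range(n):
        if guard[i]:
            c[i] = c[i] + b[i]
    return b, c


if __name__ == "__main__":
    a = [2, 0, -1, 4]
    b = [8.0, 1.0, 1.0, 2.0]
    c = [1.0, 1.0, 1.0, 1.0]
    print(with_branch(a, b, c) == if_converted(a, b, c))
```

Both functions compute the same result; the difference is that in the second form every statement reads `guard` explicitly, which is the kind of definition-use relationship a data-dependence analyzer already understands.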