Extracting task-level parallelism

Authors:
Milind Girkar;Constantine D. Polychronopoulos
Affiliations:
Sun Microsystems Inc., 2550 Garcia Ave., MS MTV12-40, Mountain View, CA;Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, Urbana, IL
Venue:
ACM Transactions on Programming Languages and Systems (TOPLAS)
Year:
1995

Abstract

Automatic detection of task-level parallelism (also referred to as functional, DAG, unstructured, or thread parallelism) at various levels of program granularity is becoming increasingly important for parallelizing and back-end compilers. Parallelizing compilers detect parallelism at the iteration level or at coarser granularity, which is suitable for parallel computers; detection of parallelism at the statement or operation level is essential for most modern microprocessors, including superscalar and VLIW architectures. In this article we study the problem of detecting, expressing, and optimizing task-level parallelism, where “task” refers to a program statement of arbitrary granularity. Optimizing the amount of functional parallelism (by allowing synchronization between arbitrary nodes) in sequential programs requires the notion of precedence in terms of paths in graphs that incorporate control and data dependences. Precedences have been defined before in a different context; however, that definition depended on the ideas of parallel execution and time. We show that the problem of determining precedences statically is NP-complete. Determining precedence relationships is useful in finding the essential data dependences. We show that there exists a unique minimum set of essential data dependences; finding this minimum set is NP-hard and NP-easy. We also propose a heuristic algorithm for finding the set of essential data dependences. We statically analyzed a program from the Perfect Benchmarks and present some experimental results.
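
The idea of an essential data dependence can be illustrated with a deliberately simplified sketch. It assumes a task graph that contains only data dependences and no control dependences, so that a dependence is redundant exactly when some longer path of other dependences already enforces the same ordering. Under that assumption the problem reduces to the transitive reduction of a DAG, which is far easier than the general problem the abstract describes and is not the heuristic algorithm proposed in the article. The function name `essential_dependences` and the task labels are hypothetical.

```python
from collections import defaultdict


def essential_dependences(edges):
    """Return the dependences of a DAG that no other path already implies.

    Simplifying assumption: edges are data dependences only, so this is a
    plain transitive reduction, not the control-aware analysis of the paper.
    """
    succ = defaultdict(set)
    for src, dst in edges:
        succ[src].add(dst)

    def reachable(start, goal, skipped_edge):
        # Depth-first search from start to goal that ignores skipped_edge.
        stack, seen = [start], {start}
        while stack:
            node = stack.pop()
            for nxt in succ[node]:
                if (node, nxt) == skipped_edge or nxt in seen:
                    continue
                if nxt == goal:
                    return True
                seen.add(nxt)
                stack.append(nxt)
        return False

    # Keep an edge only if its sink cannot be reached some other way.
    return sorted(e for e in set(edges) if not reachable(e[0], e[1], e))


# Hypothetical tasks S1..S4; S1->S3 and S1->S4 are implied by the chain.
deps = [("S1", "S2"), ("S2", "S3"), ("S1", "S3"), ("S3", "S4"), ("S1", "S4")]
result = essential_dependences(deps)
print(result)
```

In this toy graph only the chain S1, S2, S3, S4 needs explicit synchronization. The minimum set is unique here because the graph is acyclic; with control dependences included, whether a path is actually taken depends on branch outcomes, which is where the hardness results in the abstract arise.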