Compilers: principles, techniques, and tools
Compilers: principles, techniques, and tools
The program dependence graph and its use in optimization
ACM Transactions on Programming Languages and Systems (TOPLAS)
Automatic translation of FORTRAN programs to vector form
ACM Transactions on Programming Languages and Systems (TOPLAS)
Compiler algorithms for synchronization
IEEE Transactions on Computers
Static analysis of low-level synchronization
PADD '88 Proceedings of the 1988 ACM SIGPLAN and SIGOPS workshop on Parallel and distributed debugging
Automatic generation of DAG parallelism
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
International Journal of High Speed Computing
The hierarchical task graph and its use in auto-scheduling
ICS '91 Proceedings of the 5th international conference on Supercomputing
Functional parallelism: theoretical foundations and implementation
Functional parallelism: theoretical foundations and implementation
A hierarchical approach to instruction-level parallelization
International Journal of Parallel Programming
Supercomputer performance evaluation and the Perfect Benchmarks
ICS '90 Proceedings of the 4th international conference on Supercomputing
Programmers use slices when debugging
Communications of the ACM
Optimizing Supercompilers for Supercomputers
Optimizing Supercompilers for Supercomputers
Dependence Analysis for Supercomputing
Dependence Analysis for Supercomputing
Dependence graphs and compiler optimizations
POPL '81 Proceedings of the 8th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Computers and Intractability: A Guide to the Theory of NP-Completeness
Computers and Intractability: A Guide to the Theory of NP-Completeness
Automatic Extraction of Functional Parallelism from Ordinary Programs
IEEE Transactions on Parallel and Distributed Systems
Compiler Transformations for High-Performance Computing
Compiler Transformations for High-Performance Computing
Dependence analysis for subscripted variables and its application to program transformations
Dependence analysis for subscripted variables and its application to program transformations
Optimizing supercompilers for supercomputers
Optimizing supercompilers for supercomputers
Hypersequential Programming: A New Way to Develop Concurrent Programs
IEEE Parallel & Distributed Technology: Systems & Technology
A Parallelization Domain Oriented Multilevel Graph Partitioner
IEEE Transactions on Computers
A compile-time optimization framework for Ada rendezvous
ACM SIGPLAN Notices
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
Speculative thread decomposition through empirical optimization
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
A program auto-parallelizer based on the component technology of optimizing compiler construction
Programming and Computing Software
Proceedings of the ACM SIGPLAN/SIGBED 2010 conference on Languages, compilers, and tools for embedded systems
Finding, expressing and managing parallelism in programs executed on clusters of workstations
Computer Communications
Elastic computing: A portable optimization framework for hybrid computers
Parallel Computing
The RACECAR heuristic for automatic function specialization on multi-core heterogeneous systems
Proceedings of the 2012 international conference on Compilers, architectures and synthesis for embedded systems
Throughput-oriented kernel porting onto FPGAs
Proceedings of the 50th Annual Design Automation Conference
Hi-index | 0.00 |
Automatic detection of task-level parallelism (also referred to as functional, DAG, unstructured, or thread parallelism) at various levels of program granularity is becoming increasingly important for parallelizing and back-end compilers. Parallelizing compilers detect iteration-level or coarser granularity parallelism which is suitable for parallel computers; detection of parallelism at the statement-or operation-level is essential for most modern microprocessors, including superscalar and VLIW architectures. In this article we study the problem of detecting, expressing, and optimizing task-level parallelism, where “task” refers to a program statement of arbitrary granularity. Optimizing the amount of functional parallelism (by allowing synchronization between arbitrary nodes) in sequential programs requires the notion of precedence in terms of paths in graphs which incorporate control and data dependences. Precedences have been defined before in a different context; however, the definition was dependent on the ideas of parallel execution and time. We show that the problem of determining precedences statically is NP-complete. Determining precedence relationships is useful in finding the essential data dependences. We show that there exists a unique minimum set of essential data dependences; finding this minimum set is NP-hard and NP-easy. We also propose a heuristic algorithm for finding the set of essential data dependences. Static analysis of a program in the Perfect Benchmarks was done, and we present some experimental results.