Many parallel programs contain multiple sub-computations, each with distinct communication and load-balancing requirements. The traditional approach to compiling such programs imposes a processor synchronization barrier between sub-computations, optimizing each as a separate entity. This paper develops a methodology for managing the interactions among sub-computations, avoiding strict synchronization where concurrent or pipelined relationships are possible.

Our approach to compiling parallel programs has two components: symbolic data access analysis and adaptive runtime support. We summarize the data access behavior of sub-computations (such as loop nests) and split them to expose concurrency and pipelining opportunities. The split transformation has been incorporated into an extended FORTRAN compiler, which outputs a FORTRAN 77 program augmented with calls to library routines written in C, together with a coarse-grained dataflow graph summarizing the exposed parallelism.

The compiler encodes symbolic information, including loop bounds and communication requirements, for an adaptive runtime system, which uses runtime information to improve the scheduling efficiency of irregular sub-computations. The runtime system incorporates algorithms that allocate processing resources to concurrently executing sub-computations and choose communication granularity. We have demonstrated that these dynamic techniques substantially improve performance on a range of production applications, including climate modeling and x-ray tomography, especially when large numbers of processors are available.
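A coarse-grained dataflow graph like the one the compiler emits can be consumed by a runtime that releases a sub-computation as soon as all of its data dependences are satisfied. The sketch below is a minimal Kahn-style illustration in Python; the graph encoding (successor lists) and the function name are assumptions for exposition, not the paper's actual format, and it returns "waves" of sub-computations that could run concurrently rather than performing any real scheduling.

```python
def dataflow_order(graph):
    """Group the nodes of a coarse-grained dataflow graph into 'waves'.

    graph: dict mapping each sub-computation to the list of
    sub-computations that consume its output (data dependences).
    Each returned wave contains nodes whose inputs are all ready,
    so its members could execute concurrently.
    (Illustrative sketch, not the paper's runtime system.)
    """
    # Count incoming dependences for every node.
    indeg = {n: 0 for n in graph}
    for succs in graph.values():
        for s in succs:
            indeg[s] += 1

    wave = [n for n, d in indeg.items() if d == 0]
    waves = []
    while wave:
        waves.append(sorted(wave))
        nxt = []
        for n in wave:
            # Completing n satisfies one dependence of each successor.
            for s in graph[n]:
                indeg[s] -= 1
                if indeg[s] == 0:
                    nxt.append(s)
        wave = nxt
    return waves
```

For example, two independent producers feeding one consumer form two waves: the producers together, then the consumer.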
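The adaptive scheduling of irregular loops mentioned above is typically realized with self-scheduling rules that shrink chunk sizes as a loop nears completion, so that early large chunks amortize overhead while late small chunks smooth out load imbalance. The following Python sketch uses a factoring-style rule (each batch hands out one chunk per processor, sized from the remaining iteration count); the function name and the batching factor `x` are illustrative assumptions, not the paper's runtime interface.

```python
def factoring_chunks(n_iters, n_procs, x=2):
    """Return the sequence of chunk sizes a factoring-style
    self-scheduler would hand out for a loop of n_iters iterations
    on n_procs processors. In each batch, every processor receives
    one chunk of ceil(remaining / (x * n_procs)) iterations.
    (Illustrative sketch; 'x' is a tuning factor, commonly 2.)
    """
    chunks = []
    remaining = n_iters
    while remaining > 0:
        # Ceiling division without floats; never drop below 1 iteration.
        size = max(1, -(-remaining // (x * n_procs)))
        for _ in range(n_procs):
            if remaining == 0:
                break
            take = min(size, remaining)
            chunks.append(take)
            remaining -= take
    return chunks
```

The chunk sizes are non-increasing, which is what lets the scheduler tolerate iterations of unpredictable cost: any processor that draws a slow early chunk is balanced by the many small chunks remaining at the end.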