For a wide variety of applications, both task and data parallelism must be exploited to achieve the best possible performance on a multicomputer. Recent research has underlined the importance of exploiting task and data parallelism in a single compiler framework, and such a compiler can map a single source program in many different ways onto a parallel machine. The tradeoffs between task and data parallelism are complex and depend on the characteristics of the program to be executed, most significantly its memory and communication requirements, and on the performance parameters of the target parallel machine. In this paper, we present a framework to isolate and examine the specific characteristics of programs that determine the performance of different mappings. Our focus is on applications that process a stream of input and whose computation structure is fairly static and predictable. We describe three such applications that were developed with our compiler: fast Fourier transforms, narrowband tracking radar, and multibaseline stereo. We examine the tradeoffs between various mappings for them and show how the framework is used to obtain efficient mappings.