The availability of large-scale multitasked parallel architectures introduces the following processor assignment problem. We are given a long sequence of data sets, each of which is to undergo processing by a collection of tasks whose intertask data dependencies form a series-parallel partial order. Each individual task is potentially parallelizable, with a known, experimentally determined execution signature. Recognizing that data sets can be pipelined through the task structure, the problem is to find a "good" assignment of processors to tasks. Two objectives interest us: minimal response time per data set, given a throughput requirement, and maximal throughput, given a response time requirement. Our approach is to decompose a series-parallel task system into its essential "serial" and "parallel" components; our problem admits the independent solution and recomposition of each such component. We provide algorithms for the series analysis, and use an algorithm due to Krishnamurti and Ma for the parallel analysis. For a p-processor system and a series-parallel precedence graph with n constituent tasks, we give an O(np^2) algorithm that finds the optimal assignment (over a broad class of assignments) for the response time optimization problem; we find the assignment optimizing the constrained throughput in O(np^2 log p) time. These techniques are applied to a task system in computer vision.
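The series case can be illustrated with a small dynamic program. The sketch below is not the algorithm from the paper; it is a minimal illustration under simplifying assumptions: the tasks form a pure chain, communication costs are ignored, each task's execution signature is given as a table of times indexed by processor count, and the throughput requirement is expressed as a maximum per-stage time (`period`). The names `min_response_time`, `exec_time`, and `period` are hypothetical. Its three nested loops over tasks, processor budgets, and per-task allocations give O(np^2) work, the same order as the bound quoted above for response time.

```python
from math import inf

def min_response_time(exec_time, p, period):
    """Assign processors to a chain of pipelined tasks (illustrative model).

    exec_time[i][k-1] is the assumed time of task i on k processors.
    Every stage must finish within `period` (the throughput constraint);
    the response time is taken to be the sum of the stage times.
    Returns (best_time, allocation), or (inf, []) if infeasible.
    """
    n = len(exec_time)
    # best[i][q]: minimal total time of the first i tasks using at most q processors
    best = [[inf] * (p + 1) for _ in range(n + 1)]
    choice = [[0] * (p + 1) for _ in range(n + 1)]
    for q in range(p + 1):
        best[0][q] = 0.0

    for i in range(1, n + 1):
        times = exec_time[i - 1]
        for q in range(1, p + 1):
            # Try giving k processors to task i, leaving q - k for earlier tasks.
            for k in range(1, min(q, len(times)) + 1):
                t = times[k - 1]
                if t > period:
                    continue  # this stage would violate the throughput requirement
                cand = best[i - 1][q - k] + t
                if cand < best[i][q]:
                    best[i][q] = cand
                    choice[i][q] = k

    if best[n][p] == inf:
        return inf, []

    # Walk the recorded choices backwards to recover the allocation.
    alloc, q = [], p
    for i in range(n, 0, -1):
        k = choice[i][q]
        alloc.append(k)
        q -= k
    alloc.reverse()
    return best[n][p], alloc
```

For example, with made-up signatures `[[8, 4, 3, 2.5], [6, 3, 2, 1.5], [4, 2, 2, 2]]`, four processors, and a period of 6, the first task cannot meet the period on one processor, so the only feasible split is two processors for it and one each for the others.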