Trapezoid Self-Scheduling: A Practical Scheduling Scheme for Parallel Compilers

Authors:
T. H. Tzen; L. M. Ni
Venue:
IEEE Transactions on Parallel and Distributed Systems
Year:
1993

Citing 2
Cited 63

Guided self-scheduling: A practical scheduling scheme for parallel supercomputers

IEEE Transactions on Computers
Design Tradeoffs for Process Scheduling in Shared Memory Multiprocessor Systems

IEEE Transactions on Software Engineering

Exploiting the parallelism available in loops

Computer
Combining static and dynamic scheduling on distributed-memory multiprocessors

ICS '94 Proceedings of the 8th international conference on Supercomputing
Impact of Memory Contention on Dynamic Scheduling on NUMA Multiprocessors

IEEE Transactions on Parallel and Distributed Systems
Adaptively Scheduling Parallel Loops in Distributed Shared-Memory Systems

IEEE Transactions on Parallel and Distributed Systems
Space-efficient implementation of nested parallelism

PPOPP '97 Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming
Compile-time minimisation of load imbalance in loop nests

ICS '97 Proceedings of the 11th international conference on Supercomputing
Scheduling policies to support distributed 3D multimedia applications

SIGMETRICS '98/PERFORMANCE '98 Proceedings of the 1998 ACM SIGMETRICS joint international conference on Measurement and modeling of computer systems
Parallel Computing on an Ethernet Cluster of Workstations: Opportunities and Constraints

The Journal of Supercomputing
Space-efficient scheduling of nested parallelism

ACM Transactions on Programming Languages and Systems (TOPLAS)
Distributed message routing and run-time support for message-passing parallel programs derived from ordinary programs

SAC '94 Proceedings of the 1994 ACM symposium on Applied computing
Dynamic Task Scheduling Using Online Optimization

IEEE Transactions on Parallel and Distributed Systems
A comparative study of online scheduling algorithms for networks of workstations

Cluster Computing
Dependence Uniformization: A Loop Parallelization Technique

IEEE Transactions on Parallel and Distributed Systems
Using Processor Affinity in Loop Scheduling on Shared-Memory Multiprocessors

IEEE Transactions on Parallel and Distributed Systems
Load Balancing Highly Irregular Computations with the Adaptive Factoring

IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Performance of Scheduling Scientific Applications with Adaptive Weighted Factoring

IPDPS '01 Proceedings of the 15th International Parallel & Distributed Processing Symposium
A Theoretical Application of Feedback Guided Dynamic Loop Scheduling

IWCC '01 Proceedings of the NATO Advanced Research Workshop on Advanced Environments, Tools, and Applications for Cluster Computing-Revised Papers
Feedback Guided Scheduling of Nested Loops

PARA '00 Proceedings of the 5th International Workshop on Applied Parallel Computing, New Paradigms for HPC in Industry and Academia
A Semi-dynamic Multiprocessor Scheduling Algorithm with an Asymptotically Optimal Competitive Ratio

Euro-Par '02 Proceedings of the 8th International Euro-Par Conference on Parallel Processing
Scheduling at Twilight the Easy Way

STACS '02 Proceedings of the 19th Annual Symposium on Theoretical Aspects of Computer Science
Adaptive Computing on the Grid Using AppLeS

IEEE Transactions on Parallel and Distributed Systems
Runtime Empirical Selection of Loop Schedulers on Hyperthreaded SMPs

IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
Shared memory multiprocessor support for functional array processing in SAC

Journal of Functional Programming
An Enhanced Parallel Loop Self-Scheduling Scheme for Cluster Environments

The Journal of Supercomputing
Design and implementation of a novel dynamic load balancing library for cluster computing

Parallel Computing - Heterogeneous computing
Feedback guided dynamic loop scheduling: convergence of the continuous case

The Journal of Supercomputing - Special issue: Parallel and distributed processing and applications
PackageBLAST: an adaptive multi-policy grid service for biological sequence comparison

Proceedings of the 2006 ACM symposium on Applied computing
New Scheduling Strategies for Randomized Incremental Algorithms in the Context of Speculative Parallelization

IEEE Transactions on Computers
Memory bank aware dynamic loop scheduling

Proceedings of the conference on Design, automation and test in Europe
On development of an efficient parallel loop self-scheduling for grid computing environments

Parallel Computing
A performance-based parallel loop scheduling on grid environments

The Journal of Supercomputing
Enhancing self-scheduling algorithms via synchronization and weighting

Journal of Parallel and Distributed Computing
Dynamic partitioning of loop iterations on heterogeneous PC clusters

The Journal of Supercomputing
Dynamic load balancing with adaptive factoring methods in scientific applications

The Journal of Supercomputing
Scalable loop self-scheduling schemes for heterogeneous clusters

International Journal of Computational Science and Engineering
Performance evaluation of a dynamic load-balancing library for cluster computing

International Journal of Computational Science and Engineering
A practical application of FGDLS to birds flock trajectory

ICCOMP'05 Proceedings of the 9th WSEAS International Conference on Computers
Derivation of self-scheduling algorithms for heterogeneous distributed computer systems: Application to internet-based grids of computers

Future Generation Computer Systems
Implementation of a Performance-Based Loop Scheduling on Heterogeneous Clusters

ICA3PP '09 Proceedings of the 9th International Conference on Algorithms and Architectures for Parallel Processing
A directive-based MPI code generator for Linux PC clusters

The Journal of Supercomputing
An adaptive multi-policy grid service for biological sequence comparison

Journal of Parallel and Distributed Computing
A parallel loop self-scheduling on extremely heterogeneous PC clusters

ICCS'03 Proceedings of the 2003 international conference on Computational science
Performance-based workload distribution on grid environments

GPC'07 Proceedings of the 2nd international conference on Advances in grid and pervasive computing
Is the schedule clause really necessary in OpenMP?

WOMPAT'03 Proceedings of the OpenMP applications and tools 2003 international conference on OpenMP shared memory parallel programming
Particle swarm optimisation based Diophantine equation solver

International Journal of Bio-Inspired Computation
Studying the impact of synchronization frequency on scheduling tasks with dependencies in heterogeneous systems

Performance Evaluation
A multiple task allocation framework for biological sequence comparison in a grid environment

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Dynamic multi phase scheduling for heterogeneous clusters

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Distributed dynamic load balancing for pipelined computations on heterogeneous systems

Parallel Computing
A new carried-dependence self-scheduling algorithm

ICCSA'05 Proceedings of the 2005 international conference on Computational Science and its Applications - Volume Part I
A performance-based parallel loop self-scheduling on grid computing environments

NPC'05 Proceedings of the 2005 IFIP international conference on Network and Parallel Computing
Convergence of the discrete FGDLS algorithm

HPCC'05 Proceedings of the First international conference on High Performance Computing and Communications
A hybrid parallel loop scheduling scheme on grid environments

GCC'05 Proceedings of the 4th international conference on Grid and Cooperative Computing
A performance-based approach to dynamic workload distribution for master-slave applications on grid environments

GPC'06 Proceedings of the First international conference on Advances in Grid and Pervasive Computing
Using hybrid MPI and OpenMP programming to optimize communications in parallel loop self-scheduling schemes for multicore PC clusters

The Journal of Supercomputing
Automatic OpenMP loop scheduling: a combined compiler and runtime approach

IWOMP'12 Proceedings of the 8th international conference on OpenMP in a Heterogeneous World
Performance evaluation of enhancement of the layered self-scheduling approach for heterogeneous multicore cluster systems

The Journal of Supercomputing
Accelerating MapReduce on a coupled CPU-GPU architecture

SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Using analytical models to load balancing in a heterogeneous network of computers

PaCT'07 Proceedings of the 9th international conference on Parallel Computing Technologies
Distributing fixed time slices in heterogeneous networks of workstations (NOWs)

ISPA'07 Proceedings of the 5th international conference on Parallel and Distributed Processing and Applications
Towards the optimal synchronization granularity for dynamic scheduling of pipelined computations on heterogeneous computing systems

Concurrency and Computation: Practice & Experience
Multiple biological sequence alignment in heterogeneous multicore clusters with user-selectable task allocation policies

The Journal of Supercomputing
Load balancing in a changing world: dealing with heterogeneity and performance variability

Proceedings of the ACM International Conference on Computing Frontiers

Quantified Score

Hi-index	0.01

Abstract

A practical processor self-scheduling scheme, trapezoid self-scheduling, is proposed for arbitrary parallel nested loops in shared-memory multiprocessors. Loops are generally the richest source of parallelism in parallel programs. Allocating loop iterations to processors dynamically can balance the load among processors, but at the expense of run-time scheduling overhead. By linearly decreasing the chunk size at run time, the proposed trapezoid self-scheduling approach obtains the best tradeoff between scheduling overhead and balanced workload. Because it is simple and flexible, the approach can be implemented efficiently in any parallel compiler. The small and predictable number of chores also allows memory to be managed efficiently in a static fashion. Experiments conducted on a 96-node Butterfly GP-1000 clearly show the advantage of trapezoid self-scheduling over other well-known self-scheduling approaches.
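
The linearly decreasing chunk size described in the abstract can be illustrated with a minimal sketch. The abstract does not give the parameters, so the choices here are assumptions: the first chunk is taken as ceil(N / 2P) for N iterations and P processors, the last chunk as 1 (a commonly cited conservative setting), and the chunk count and decrement are derived from those two values. The function name `trapezoid_chunks` is hypothetical and is not taken from the paper.

```python
import math


def trapezoid_chunks(total_iters, num_procs, first=None, last=1):
    """Return chunk sizes that shrink linearly from `first` toward `last`.

    Assumed defaults (not stated in the abstract): first = ceil(N / 2P), last = 1.
    """
    if total_iters <= 0:
        return []
    if first is None:
        first = max(last, math.ceil(total_iters / (2 * num_procs)))

    # Number of chunks if sizes form an arithmetic series from first to last.
    n_chunks = math.ceil(2 * total_iters / (first + last))
    # Constant decrement between consecutive chunks.
    step = (first - last) / (n_chunks - 1) if n_chunks > 1 else 0.0

    chunks = []
    remaining = total_iters
    k = 0
    while remaining > 0:
        size = max(last, round(first - k * step))  # never go below `last`
        size = min(size, remaining)                # final chunk takes what is left
        chunks.append(size)
        remaining -= size
        k += 1
    return chunks


if __name__ == "__main__":
    sizes = trapezoid_chunks(total_iters=1000, num_procs=4)
    print(len(sizes), "chunks:", sizes)
```

In a real self-scheduler each idle processor would claim the next chunk from a shared counter. Because the number of chunks is known up front, the whole list can be computed once, which is one way to read the abstract's remark about a small and predictable number of chores.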