Guided self-scheduling: A practical scheduling scheme for parallel supercomputers

Authors: C. D. Polychronopoulos; D. J. Kuck
Affiliations: Univ. of Illinois at Urbana-Champaign, Urbana, IL (both authors)
Venue: IEEE Transactions on Computers
Year: 1987

Abstract

This paper proposes guided self-scheduling, a new approach for scheduling arbitrarily nested parallel program loops on shared-memory multiprocessor systems. Exploiting loop parallelism is crucial to achieving high system and program performance. Because of its simplicity, guided self-scheduling is particularly well suited to implementation on real parallel machines. The method simultaneously achieves the two most important objectives: load balancing and very low synchronization overhead. For certain types of loops, we show analytically that guided self-scheduling incurs minimal overhead and achieves optimal schedules. Two other interesting properties of the method are its insensitivity to the initial processor configuration (in time) and its parameterized nature, which allows it to be tuned for different systems. Finally, we discuss experimental results that show the advantage of guided self-scheduling over the most widely known dynamic methods.
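The abstract does not state the chunking rule itself. As commonly described, guided self-scheduling gives each idle processor roughly the remaining iterations divided by the number of processors, rounded up, so chunks start large (low synchronization overhead) and shrink toward the end (load balancing). The following Python sketch illustrates that rule under this assumption. The function name `gss_chunks` and the `min_chunk` parameter are hypothetical names chosen for illustration; `min_chunk` stands in for the tunable parameter the abstract mentions and is not taken from the paper's text.

```python
def gss_chunks(n_iterations, n_processors, min_chunk=1):
    """Yield successive chunk sizes for one parallel loop.

    Assumed rule (not quoted from the abstract): each request receives
    ceil(remaining / n_processors) iterations, but never fewer than
    min_chunk unless fewer iterations are left.
    """
    if n_processors < 1 or min_chunk < 1:
        raise ValueError("n_processors and min_chunk must be >= 1")
    remaining = n_iterations
    while remaining > 0:
        # Integer ceiling division avoids floating-point rounding.
        chunk = -(-remaining // n_processors)
        # Enforce the minimum chunk size, capped by what is left.
        chunk = min(max(chunk, min_chunk), remaining)
        yield chunk
        remaining -= chunk


if __name__ == "__main__":
    # 100 iterations shared among 4 processors.
    print(list(gss_chunks(100, 4)))
    # Same loop with a larger minimum chunk, which means fewer dispatches.
    print(list(gss_chunks(100, 4, min_chunk=4)))
```

In a real runtime, each idle processor would claim its chunk by atomically updating a shared loop index; this sketch models only the sequence of chunk sizes, not the synchronization.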