When using MIMD (multiple instruction, multiple data) parallel computers, one is often confronted with solving a task composed of many independent subtasks where it is necessary to synchronize the processors after all the subtasks have been completed. This paper studies how the subtasks should be allocated to the processors in order to minimize the expected time it takes to finish all the subtasks (sometimes called the makespan). We assume that the running times of the subtasks are independent, identically distributed, increasing failure rate random variables, and that assigning one or more subtasks to a processor entails some overhead, or communication time, that is independent of the number of subtasks allocated. Our analyses, which use ideas from renewal theory, reliability theory, order statistics, and the theory of large deviations, are valid for a wide class of distributions. We show that allocating an equal number of subtasks to each processor all at once has good efficiency. This appears as a consequence of a rather general theorem which shows how some consequences of the central limit theorem hold even when we cannot prove that the central limit theorem applies.
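The setting described above can be explored numerically. The following is a minimal Monte Carlo sketch, not the paper's analysis: it estimates the efficiency of allocating an equal number of subtasks to each processor all at once. The function names, the Weibull distribution with shape 2 (one example of an increasing failure rate distribution), and all parameter values are assumptions chosen for illustration. Efficiency is taken here to be the ideal time (total expected work divided by the number of processors) over the estimated expected makespan.

```python
import math
import random


def static_makespan(n_subtasks, n_procs, overhead, draw, rng):
    """Makespan of one run when subtasks are split as evenly as possible
    and each processor receives its whole batch in a single allocation."""
    base, extra = divmod(n_subtasks, n_procs)
    finish_times = []
    for p in range(n_procs):
        count = base + (1 if p < extra else 0)
        # One fixed overhead per allocation, independent of the batch size.
        busy = overhead if count > 0 else 0.0
        busy += sum(draw(rng) for _ in range(count))
        finish_times.append(busy)
    # All processors synchronize once the slowest one has finished.
    return max(finish_times)


def estimate_efficiency(n_subtasks, n_procs, overhead, draw, mean_time,
                        trials=2000, seed=0):
    """Monte Carlo estimate of (ideal time) / (expected makespan)."""
    rng = random.Random(seed)
    total = sum(
        static_makespan(n_subtasks, n_procs, overhead, draw, rng)
        for _ in range(trials)
    )
    expected_makespan = total / trials
    ideal = n_subtasks * mean_time / n_procs
    return ideal / expected_makespan


def weibull_shape2(rng):
    """Assumed subtask time: Weibull with scale 1 and shape 2 (IFR)."""
    return rng.weibullvariate(1.0, 2.0)


# Mean of a Weibull(scale=1, shape=2) variable is Gamma(1 + 1/2).
WEIBULL_MEAN = math.gamma(1.5)

if __name__ == "__main__":
    for n in (64, 640, 6400):
        eff = estimate_efficiency(n, 8, 0.5, weibull_shape2, WEIBULL_MEAN,
                                  trials=200)
        print(f"subtasks={n:5d}  processors=8  estimated efficiency={eff:.3f}")
```

Under these assumptions the estimated efficiency should rise toward 1 as the number of subtasks per processor grows, since both the fixed overhead and the spread between the slowest and the average processor shrink relative to the total work.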