Optimal task scheduling on distributed parallel processors
Performance '93: Proceedings of the 16th IFIP Working Group 7.3 International Symposium on Computer Performance Modeling, Measurement and Evaluation
Consider a set of parallel processors operating in a distributed fashion, so that task migration among processors is prohibited. A given set of jobs, each consisting of a number of tasks, is to be processed by these processors. Finding an optimal schedule in this setting is, in general, a difficult combinatorial problem. Our focus here is on identifying key properties of the problem that provide insight into the structure of optimal policies. We demonstrate that these properties are all rooted in the subadditivity, submodularity, convexity, and Schur convexity of the two operators, max and plus, that relate task times to the flow time (expected job completion time). We show that in general the optimal policy has a threshold structure and a sequential "tail": there exists a threshold such that once the number of jobs already scheduled exceeds it, all remaining jobs must be scheduled sequentially, i.e., each job, with the entirety of its tasks, is assigned to a single processor. For the special case of two processors, we further develop a recursive algorithm that generates the complete optimal schedule. For three or more processors, we focus on a class of policies that we call "fully parallel or fully sequential" (FPFS), and identify optimal policies within this class.
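The max/plus structure mentioned in the abstract can be illustrated with a minimal sketch: within one processor, the task times of queued work add up (plus), while a job that is split across processors completes only when its slowest branch finishes (max). The function and data below are illustrative assumptions, not the paper's algorithm.

```python
def job_completion_times(schedule):
    """schedule: list of (job_id, processor_id, task_time) tuples,
    given in the order tasks are served on each processor.
    Returns a dict of job completion times: task times accumulate
    per processor (plus), and a job's completion is the latest
    finish over all processors holding its tasks (max)."""
    proc_clock = {}   # running workload sum on each processor (plus)
    job_done = {}     # latest finish time over a job's tasks (max)
    for job, proc, t in schedule:
        proc_clock[proc] = proc_clock.get(proc, 0.0) + t
        job_done[job] = max(job_done.get(job, 0.0), proc_clock[proc])
    return job_done

# A "fully parallel" job splits its tasks across processors,
# while a "fully sequential" job keeps all tasks on one processor:
parallel = [("J1", "P1", 3.0), ("J1", "P2", 5.0)]
sequential = [("J2", "P1", 3.0), ("J2", "P1", 5.0)]
print(job_completion_times(parallel)["J1"])    # max(3, 5) = 5.0
print(job_completion_times(sequential)["J2"])  # 3 + 5 = 8.0
```

This contrast is exactly the trade-off the optimal policy balances: parallelizing a job shortens its own completion time, but spreads its tasks over processors that later jobs must then wait behind.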