We propose a new approach, called cluster-based search (CBS), for scheduling large task graphs, in which the scheduling algorithm itself runs in parallel on a heterogeneous cluster of workstations connected by a high-speed network (e.g., an ATM switch at OC-3 speed). The CBS algorithm uses a parallel random neighborhood search that refines several different initial schedules simultaneously, one on each workstation. The workstations periodically exchange the best solutions they have found so far, which directs the search toward more promising regions of the search space. Machine heterogeneity is exploited through a biased partitioning of the search space. The parallel random neighborhood search is fault-tolerant: the workload of a failed workstation is automatically redistributed to the remaining workstations so that the search can continue. We have implemented the CBS algorithm as a core function of our ongoing development of SSI middleware for a Sun workstation cluster.
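The ideas in the abstract can be illustrated with a small sketch. It is not the CBS implementation. It is a sequential Python simulation under several assumptions: the task graph `TASKS`, the communication cost `COMM`, the processor count `N_PROCS`, and the functions `makespan`, `refine`, and `cbs_sketch` are all hypothetical names introduced here. "Biased partitioning" is approximated by giving each simulated workstation a share of the move budget proportional to an assumed relative speed, and a workstation failure is modeled by removing it so that the survivors absorb its share.

```python
import random

# Hypothetical toy task graph: task -> (cost, predecessor tasks).
# Keys are listed in topological order.
TASKS = {
    "a": (2, []), "b": (3, ["a"]), "c": (4, ["a"]),
    "d": (2, ["b"]), "e": (3, ["b", "c"]), "f": (1, ["d", "e"]),
}
COMM = 1      # assumed delay when a result crosses processors
N_PROCS = 2   # processors of the machine being scheduled for


def makespan(assign):
    """Finish time of a schedule that runs tasks in topological order."""
    proc_free = [0] * N_PROCS
    finish = {}
    for task, (cost, preds) in TASKS.items():
        p = assign[task]
        ready = max(
            (finish[q] + (COMM if assign[q] != p else 0) for q in preds),
            default=0,
        )
        finish[task] = max(ready, proc_free[p]) + cost
        proc_free[p] = finish[task]
    return max(finish.values())


def refine(assign, steps, rng):
    """Random neighborhood search: move one task, keep it if not worse."""
    best, best_cost = dict(assign), makespan(assign)
    for _ in range(steps):
        cand = dict(best)
        cand[rng.choice(list(TASKS))] = rng.randrange(N_PROCS)
        cost = makespan(cand)
        if cost <= best_cost:
            best, best_cost = cand, cost
    return best, best_cost


def cbs_sketch(speeds, rounds=5, budget=120, failed_at=None, seed=0):
    """Simulate the search workers one after another.

    speeds maps worker name -> relative speed (an assumption of this sketch);
    failed_at is an optional (worker, round) pair simulating a failure.
    """
    rng = random.Random(seed)
    alive = dict(speeds)
    # Each worker starts from its own random initial schedule.
    current = {w: {t: rng.randrange(N_PROCS) for t in TASKS} for w in alive}
    best, best_cost = None, float("inf")
    for r in range(rounds):
        if failed_at and failed_at[1] == r:
            alive.pop(failed_at[0], None)  # the failed worker drops out
        total = sum(alive.values())
        for w, speed in alive.items():
            # Share of the budget proportional to speed; after a failure the
            # total shrinks, so survivors take over the lost share.
            steps = budget * speed // total
            current[w], cost = refine(current[w], steps, rng)
            if cost < best_cost:
                best, best_cost = dict(current[w]), cost
        # Periodic exchange: every surviving worker continues from the best.
        for w in alive:
            current[w] = dict(best)
    return best, best_cost
```

A call such as `cbs_sketch({"fast": 3, "slow": 1}, failed_at=("slow", 2))` returns a task-to-processor assignment and its makespan. A real deployment would run the workers on separate machines and exchange solutions over the network. The sequential loop only shows the control flow.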