Heterogeneous computing systems are promising platforms, because systems built on a single parallel architecture may not be sufficient to exploit the parallelism available in running applications. In some cases, heterogeneous distributed computing (HDC) systems can achieve higher performance at lower cost than single-machine supersystems. However, processors and networks in HDC systems are not failure free, and any failure may be critical to the running applications. One way of dealing with such failures is to employ a reliable scheduling algorithm. Unfortunately, most existing scheduling algorithms for precedence-constrained tasks in HDC systems do not adequately consider the reliability requirements of interdependent tasks. In this paper, we design a reliability-driven scheduling architecture that can effectively measure system reliability, based on an optimal reliability communication path search algorithm, and we introduce the reliability priority rank (RRank) to estimate a task's priority by considering reliability overheads. Furthermore, based on the directed acyclic graph (DAG) model, we propose a reliability-aware scheduling algorithm for precedence-constrained tasks that can achieve high reliability for applications. Comparison studies, based on both randomly generated graphs and the graphs of some real applications, show that our scheduling algorithm outperforms existing scheduling algorithms in terms of makespan, scheduling length ratio, and reliability. Moreover, the improvement gained by our algorithm increases as the data communication among tasks increases.
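To make the idea of an optimal reliability communication path concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes each network link has an independent reliability in (0, 1] and that the reliability of a path is the product of its link reliabilities; under those assumptions, a Dijkstra-style search that maximizes the product finds the most reliable route between two processors. The function name `most_reliable_path`, the `links` dictionary layout, and the example values are all hypothetical.

```python
import heapq


def most_reliable_path(links, source, target):
    """Return (reliability, path) for the most reliable route.

    links maps a node to a list of (neighbor, link_reliability) pairs,
    with each reliability assumed to lie in (0, 1].
    Returns (0.0, []) when the target is unreachable.
    """
    best = {source: 1.0}      # best known path reliability per node
    parent = {source: None}   # predecessor on the best known path
    heap = [(-1.0, source)]   # max-heap emulated with negated reliability
    done = set()

    while heap:
        neg_rel, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == target:
            break
        for neighbor, link_rel in links.get(node, []):
            candidate = -neg_rel * link_rel
            if candidate > best.get(neighbor, 0.0):
                best[neighbor] = candidate
                parent[neighbor] = node
                heapq.heappush(heap, (-candidate, neighbor))

    if target not in best:
        return 0.0, []

    # Walk the predecessor chain back from the target.
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return best[target], path[::-1]


# Hypothetical three-processor network: the direct link p1 -> p2 is less
# reliable than the two-hop route through p3.
example_links = {
    "p1": [("p2", 0.90), ("p3", 0.99)],
    "p3": [("p2", 0.98)],
    "p2": [],
}
reliability, route = most_reliable_path(example_links, "p1", "p2")
```

In this example the two-hop route is chosen because 0.99 × 0.98 exceeds 0.90. The greedy search is valid here only because multiplying by a value of at most 1 never increases path reliability, which mirrors the non-negative edge weight condition of ordinary shortest-path search.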