IEEE Transactions on Parallel and Distributed Systems
A distributed scientific workflow mapping algorithm is proposed that maximizes reliability under a given end-to-end delay (EED) bound. The algorithm targets heterogeneous distributed computing environments, where failures of computing nodes and communication links are inevitable. Both the mapping decisions and the stored table information are distributed among the nodes to achieve scalability and robustness, which are especially important in large-scale distributed systems. This Distributed Reliability Maximization workflow mapping algorithm under End-to-end Delay constraint (dis-DRMED) addresses the maximum-reliability and minimum-EED objectives in a two-step procedure. In the first step, a mapping algorithm that combines iterative Critical Path search with Layer-based priority assignment (CPL) minimizes the EED by focusing on the optimal allocation of tasks on the critical path. In the second step, tasks on noncritical paths are remapped to improve the overall execution reliability. Simulation results under various system setups show that dis-DRMED achieves considerably higher reliability than several representative workflow mapping algorithms under the same EED constraint.
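The two-step idea can be illustrated with a small, centralized toy sketch. It is not the dis-DRMED or CPL algorithm itself: it ignores communication links and node sharing, assumes each task's reliability is exp(-failure_rate * execution_time), and uses a hypothetical four-task workflow and two hypothetical nodes. All names (`TASKS`, `NODES`, `two_step_mapping`, and so on) are invented for illustration.

```python
import math

# Hypothetical toy workflow: task -> (workload, predecessor tasks).
# Tasks are listed in topological order (dicts keep insertion order).
TASKS = {
    "a": (4.0, []),
    "b": (8.0, ["a"]),
    "c": (2.0, ["a"]),
    "d": (4.0, ["b", "c"]),
}
# Hypothetical nodes: name -> (speed, failure rate per unit time).
NODES = {
    "fast": (2.0, 0.02),
    "slow": (1.0, 0.001),
}

def exec_time(task, node):
    """Execution time of a task on a node (workload / speed)."""
    return TASKS[task][0] / NODES[node][0]

def finish_times(mapping):
    """Earliest finish time of each task; communication cost is assumed zero."""
    finish = {}
    for task, (_, preds) in TASKS.items():
        start = max((finish[p] for p in preds), default=0.0)
        finish[task] = start + exec_time(task, mapping[task])
    return finish

def eed(mapping):
    """End-to-end delay: the latest finish time in the workflow."""
    return max(finish_times(mapping).values())

def critical_path(mapping):
    """Walk back from the last-finishing task through its latest predecessors."""
    finish = finish_times(mapping)
    task = max(finish, key=finish.get)
    path = [task]
    while TASKS[task][1]:
        task = max(TASKS[task][1], key=finish.get)
        path.append(task)
    return path[::-1]

def reliability(mapping):
    """Assumed model: product over tasks of exp(-failure_rate * exec_time)."""
    exponent = sum(NODES[n][1] * exec_time(t, n) for t, n in mapping.items())
    return math.exp(-exponent)

def two_step_mapping(eed_bound):
    # Step 1 (stand-in for EED minimization): put every task on the fastest node.
    fastest = max(NODES, key=lambda n: NODES[n][0])
    mapping = {t: fastest for t in TASKS}
    on_cp = set(critical_path(mapping))
    # Step 2: greedily remap noncritical tasks when that raises reliability
    # and the end-to-end delay stays within the bound.
    for task in TASKS:
        if task in on_cp:
            continue
        for node in NODES:
            trial = {**mapping, task: node}
            if eed(trial) <= eed_bound and reliability(trial) > reliability(mapping):
                mapping = trial
    return mapping
```

In this toy setup the critical path is a, b, d. With an EED bound of 8.0, only the noncritical task c is moved to the slower but more reliable node, which raises the overall reliability without lengthening the delay. The critical path is computed once here for brevity; the EED check on every trial mapping is what keeps the constraint satisfied.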