Distributed workflow mapping algorithm for maximized reliability under end-to-end delay constraint
The Journal of Supercomputing
Hi-index | 0.00 |
Many large-scale scientific applications feature distributed computing workflows of complex structures that must be executed and transferred in shared wide-area networks consisting of unreliable nodes and links. Mapping these computing workflows in such faulty network environments for optimal latency while ensuring certain fault tolerance is crucial to the success of eScience that requires both performance and reliability. We construct analytical cost models and formulate workflow mapping as an optimization problem under failure rate constraint. We propose a distributed heuristic mapping solution based on recursive critical path to achieve minimum end-to-end delay and satisfy a pre-specified overall failure rate for a guaranteed level of fault tolerance. The performance superiority of the proposed mapping solution is illustrated by extensive simulation-based comparisons with existing mapping algorithms.