A New Survival Architecture for Network Processors
AISA '02 Proceedings of the First International Workshop on Advanced Internet Services and Applications
Abstract: This paper addresses fault tolerance in the WebCom metacomputer. WebCom's computation platform is dynamically reconfigurable and volunteer-based. Since its constituent machines may join and leave unpredictably, fault survival and efficient fault recovery are of paramount importance. A fault tolerance mechanism is outlined that relies on a fast and efficient processor replacement procedure. It is shown that the characteristics of this procedure, together with the hierarchical and referentially transparent nature of WebCom executions, can be used to limit the effect of a fault to its immediate neighbourhood.