Proactive Fortification of Fault-Tolerant Services
OPODIS '09 Proceedings of the 13th International Conference on Principles of Distributed Systems
The common approach to anomaly-based intrusion detection is to replicate the computation and run a Byzantine agreement protocol among all replicas. However, Byzantine agreement incurs high communication overhead and requires more than 3t replicas in order to overcome t such failures. For many applications, and scientific computation in particular, it is possible to achieve the same goal with much lower average communication and replication overheads. This paper presents a new approach for detecting an intrusion by combining checkpoint/restart with replication. The main benefit of the approach is that we replicate the execution into only t + 1 replicas, and invoke a Byzantine agreement only if we suspect anomalous behavior, which can be observed using checkpointing techniques. If a failure occurs, it is detected using any Byzantine agreement protocol that can agree on a recent valid system state. Such a Byzantine agreement protocol also identifies the compromised nodes and eliminates them, so the computation can proceed with only t + 1 replicas until the next failure occurs.
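The detection step described in the abstract can be illustrated with a minimal sketch. It is not the paper's protocol: the function names, the use of SHA-256 digests, and the assumption that replicas are deterministic (so correct replicas produce byte-identical checkpoints) are all illustrative choices. The idea it shows is that with t + 1 replicas and at most t compromised ones, at least one replica is correct, so if all t + 1 checkpoint digests match, the checkpoint matches a correct replica's state; any mismatch is a reason to suspect an anomaly and fall back to a full Byzantine agreement, which is left out of the sketch.

```python
import hashlib


def checkpoint_digest(state: bytes) -> str:
    """Return a SHA-256 digest of a serialized checkpoint (hypothetical helper)."""
    return hashlib.sha256(state).hexdigest()


def needs_agreement(checkpoints: list[bytes], t: int) -> bool:
    """Decide whether the t + 1 replicas must escalate to Byzantine agreement.

    Assumes deterministic replicas: correct replicas yield identical
    checkpoints. Returns True when at least two digests differ.
    """
    if len(checkpoints) != t + 1:
        raise ValueError("expected exactly t + 1 checkpoints")
    digests = {checkpoint_digest(c) for c in checkpoints}
    # All digests equal: the state matches at least one correct replica.
    # Otherwise: anomalous behavior is suspected and agreement is required.
    return len(digests) > 1


# Usage: tolerate t = 2 compromised replicas with 3 replicas in total.
t = 2
matching = [b"step=100;x=42"] * (t + 1)
tampered = [b"step=100;x=42", b"step=100;x=42", b"step=100;x=13"]
print(needs_agreement(matching, t))
print(needs_agreement(tampered, t))
```

In the common case the replicas exchange only digests at each checkpoint, which is where the lower average communication cost would come from; the expensive agreement runs only on a mismatch.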