The Fail-Heterogeneous Architectural Model

Authors:
Marco Serafini;Neeraj Suri
Affiliations:
Technical University of Darmstadt, Germany;Technical University of Darmstadt, Germany
Venue:
SRDS '07 Proceedings of the 26th IEEE International Symposium on Reliable Distributed Systems
Year:
2007

Citing 0
Cited 2

Reducing the costs of large-scale BFT replication
LADIS '08 Proceedings of the 2nd Workshop on Large-Scale Distributed Systems and Middleware

Failure-aware resource management for high-availability computing clusters with distributed virtual machines
Journal of Parallel and Distributed Computing


Abstract

Fault tolerant distributed protocols typically utilize a homogeneous fault model, either fail-crash or fail-Byzantine, where all processors are assumed to fail in the same manner. In practice, due to complexity and evolvability reasons, only a subset of the nodes can actually be designed to have a restricted, fail-crash failure mode, provided that they are free of design faults. Based on this consideration, we propose a fail-heterogeneous architectural model for distributed systems which considers two classes of nodes: (a) full-fledged execution nodes, which can be fail-Byzantine, and (b) lightweight, validated coordination nodes, which can only be fail-crash. To illustrate the model we introduce HeterTrust as a practical trustworthy service replication protocol. It has a low latency overhead, requires few execution nodes with diversified design, and prevents intruded servers from disclosing confidential data. We also discuss applications of the model to DoS attacks mitigation and to group membership.
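
The two node classes named in the abstract can be written down as a small data model. The following is only an illustrative sketch, not code from the paper: the names `FailureMode`, `NodeClass`, `Node`, and `is_heterogeneous` are hypothetical, and nothing here reflects how HeterTrust itself is implemented.

```python
from dataclasses import dataclass
from enum import Enum


class FailureMode(Enum):
    CRASH = "fail-crash"
    BYZANTINE = "fail-Byzantine"


class NodeClass(Enum):
    EXECUTION = "execution"        # full-fledged node, may be fail-Byzantine
    COORDINATION = "coordination"  # lightweight, validated node, fail-crash only


# Worst-case failure mode the model assumes for each node class (from the abstract).
ASSUMED_FAILURE_MODE = {
    NodeClass.EXECUTION: FailureMode.BYZANTINE,
    NodeClass.COORDINATION: FailureMode.CRASH,
}


@dataclass(frozen=True)
class Node:
    name: str  # hypothetical identifier, for illustration only
    node_class: NodeClass

    @property
    def worst_case_failure(self) -> FailureMode:
        return ASSUMED_FAILURE_MODE[self.node_class]


def is_heterogeneous(nodes):
    """Return True when the system mixes both failure modes (not a homogeneous model)."""
    return {n.worst_case_failure for n in nodes} == set(FailureMode)
```

A system made only of execution nodes corresponds to the homogeneous fail-Byzantine model; adding at least one coordination node gives the mixed setting the abstract describes.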