Intrusion-tolerant replication under attack

Authors:
Yair Amir;Jonathan Kirsch
Affiliations:
The Johns Hopkins University;The Johns Hopkins University
Venue:
Intrusion-tolerant replication under attack
Year:
2010

Abstract

Much of our critical infrastructure is controlled by large software systems whose participants are distributed across the Internet. As our dependence on these critical systems continues to grow, it becomes increasingly important that they meet strict availability and performance requirements, even in the face of malicious attacks, including those that are successful in compromising parts of the system. This dissertation presents the first replication protocols capable of guaranteeing correctness, availability, and good performance even when some of the servers are compromised, enabling the construction of highly available and highly resilient systems for our critical infrastructure. Prior to this work, intrusion-tolerant replication protocols were designed to perform well in fault-free executions, and this is how they were evaluated. In this dissertation we point out that many state-of-the-art protocols are vulnerable to significant performance degradation by a small number of malicious processors. We define a new performance-oriented correctness criterion, BOUNDED-DELAY, against which intrusion-tolerant replication protocols can be evaluated. Protocols that meet BOUNDED-DELAY are required to provide a consistent level of performance, even when the system is under attack by an adversary that controls some of the processors. We present Prime, an intrusion-tolerant replication protocol that meets BOUNDED-DELAY and thus offers a stronger performance guarantee under attack than previous state-of-the-art protocols. An evaluation of a prototype implementation shows that Prime performs competitively with existing protocols in fault-free executions and achieves an order of magnitude performance improvement in under-attack executions in 4-server and 7-server configurations. Using Prime as a building block, we show how to design and implement an attack-resilient, large-scale intrusion-tolerant replication system for wide-area networks. 
The system is hierarchical and is suited to deployments consisting of several wide-area sites, each with a cluster of replication servers. We present three mechanisms for attack-resilient and efficient inter-site communication, which enable the system to perform well in bandwidth-constrained wide-area networks without making it susceptible to performance degradation caused by malicious servers. Our results provide evidence that it is possible to construct highly resilient, large-scale survivable systems that perform well even when some of the servers (and some entire sites) are compromised.