Gamma system: continuous evolution of software after deployment
ISSTA '02 Proceedings of the 2002 ACM SIGSOFT international symposium on Software testing and analysis
CIL: Intermediate Language and Tools for Analysis and Transformation of C Programs
CC '02 Proceedings of the 11th International Conference on Compiler Construction
Bug isolation via remote program sampling
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
DART: directed automated random testing
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
Detecting BGP configuration faults with static analysis
NSDI'05 Proceedings of the 2nd conference on Symposium on Networked Systems Design & Implementation - Volume 2
Understanding and dealing with operator mistakes in internet services
OSDI'04 Proceedings of the 6th conference on Symposium on Opearting Systems Design & Implementation - Volume 6
Detecting targeted attacks using shadow honeypots
SSYM'05 Proceedings of the 14th conference on USENIX Security Symposium - Volume 14
Shadow configuration as a network management primitive
Proceedings of the ACM SIGCOMM 2008 conference on Data communication
MODIST: transparent model checking of unmodified distributed systems
NSDI'09 Proceedings of the 6th USENIX symposium on Networked systems design and implementation
CrystalBall: predicting and preventing inconsistencies in deployed distributed systems
NSDI'09 Proceedings of the 6th USENIX symposium on Networked systems design and implementation
NetReview: detecting when interdomain routing goes wrong
NSDI'09 Proceedings of the 6th USENIX symposium on Networked systems design and implementation
KLEE: unassisted and automatic generation of high-coverage tests for complex systems programs
OSDI'08 Proceedings of the 8th USENIX conference on Operating systems design and implementation
Fault prediction in distributed systems gone wild
Proceedings of the 4th International Workshop on Large Scale Distributed Systems and Middleware
Online testing of federated and heterogeneous distributed systems
Proceedings of the ACM SIGCOMM 2011 conference
A SOFT way for openflow switch interoperability testing
Proceedings of the 8th international conference on Emerging networking experiments and technologies
Hi-index | 0.00 |
Making distributed systems reliable is notoriously difficult. It is even more difficult to achieve high reliability for federated and heterogeneous systems, i.e., those that are operated by multiple administrative entities and have numerous inter-operable implementations. A prime example of such a system is the Internet's inter-domain routing, today based on BGP. We argue that system reliability should be improved by proactively identifying potential faults using an online testing functionality. We propose DiCE, an approach that continuously and automatically explores the system behavior, to check whether the system deviates from its desired behavior. DiCE orchestrates the exploration of relevant system behaviors by subjecting system nodes to many possible inputs that exercise node actions. DiCE starts exploring from current, live system state, and operates in isolation from the deployed system. We describe our experience in integrating DiCE with an opensource BGP router. We evaluate the prototype's ability to quickly detect origin misconfiguration, a recurring operator mistake that causes Internet-wide outages. We also quantify DiCE's overhead and find it to have marginal impact on system performance.