ACM Transactions on Programming Languages and Systems (TOPLAS)
Distributed Algorithms
Measuring ISP topologies with rocketfuel
Proceedings of the 2002 conference on Applications, technologies, architectures, and protocols for computer communications
Route flap damping exacerbates internet routing convergence
Proceedings of the 2002 conference on Applications, technologies, architectures, and protocols for computer communications
The Time Warp Mechanism for Database Concurrency Control
Proceedings of the Second International Conference on Data Engineering
ReVirt: enabling intrusion analysis through virtual-machine logging and replay
OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementationCopyright restrictions prevent ACM from being able to make the PDFs for this conference available for downloading
An integrated experimental environment for distributed systems and networks
OSDI '02 Proceedings of the 5th symposium on Operating systems design and implementationCopyright restrictions prevent ACM from being able to make the PDFs for this conference available for downloading
Speculative execution in a distributed file system
Proceedings of the twentieth ACM symposium on Operating systems principles
CADRE: Cycle-Accurate Deterministic Replay for Hardware Debugging
DSN '06 Proceedings of the International Conference on Dependable Systems and Networks
AVIO: detecting atomicity violations via access interleaving invariants
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Debugging operating systems with time-traveling virtual machines
ATEC '05 Proceedings of the annual conference on USENIX Annual Technical Conference
Flashback: a lightweight extension for rollback and deterministic replay for software debugging
ATEC '04 Proceedings of the annual conference on USENIX Annual Technical Conference
Understanding data lifetime via whole system simulation
SSYM'04 Proceedings of the 13th conference on USENIX Security Symposium - Volume 13
Pip: detecting the unexpected in distributed systems
NSDI'06 Proceedings of the 3rd conference on Networked Systems Design & Implementation - Volume 3
Clairvoyant: a comprehensive source-level debugger for wireless sensor networks
Proceedings of the 5th international conference on Embedded networked sensor systems
Kendo: efficient deterministic multithreading in software
Proceedings of the 14th international conference on Architectural support for programming languages and operating systems
A type and effect system for deterministic parallel Java
Proceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications
A few billion lines of code later: using static analysis to find bugs in the real world
Communications of the ACM
CoreDet: a compiler and runtime system for deterministic multithreaded execution
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Respec: efficient online multiprocessor replayvia speculation and external determinism
Proceedings of the fifteenth edition of ASPLOS on Architectural support for programming languages and operating systems
Focus replay debugging effort on the control plane
HotDep'10 Proceedings of the Sixth international conference on Hot topics in system dependability
Deterministic process groups in dOS
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Efficient system-enforced deterministic parallelism
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
Stable deterministic multithreading through schedule memoization
OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
WiDS checker: combating bugs in distributed systems
NSDI'07 Proceedings of the 4th USENIX conference on Networked systems design & implementation
Friday: global comprehension for distributed replay
NSDI'07 Proceedings of the 4th USENIX conference on Networked systems design & implementation
OFRewind: enabling record and replay troubleshooting for networks
USENIXATC'11 Proceedings of the 2011 USENIX conference on USENIX annual technical conference
Dthreads: efficient deterministic multithreading
SOSP '11 Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
Where is the debugger for my software-defined network?
Proceedings of the first workshop on Hot topics in software defined networks
DDOS: taming nondeterminism in distributed systems
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
Hi-index | 0.00 |
Large-scale networks are among the most complex software infrastructures in existence. Unfortunately, the extreme complexity of their basis, the control-plane software, leads to a rich variety of nondeterministic failure modes and anomalies. Research on debugging modern control-plane software has focused on designing comprehensive record and replay systems, but the large volumes of recordings often hinder the scalability of these designs. Here, we argue for a different approach. Namely, we take the position that deterministic network execution would vastly simplify the control-plane debugging process. This paper presents the design and implementation of DEFINED, a user-space substrate for interactive debugging that provides deterministic execution of networks in highly distributed and dynamic environments. We demonstrate our system's advantages by reproducing discovery of known ordering and timing bugs in popular software routing platforms, XORP and Quagga. Using Rocketfuel topologies and routing data from a Tier-1 backbone, we show DEFINED is practical and scalable for interactive fault diagnosis in large networks.