Today's networks are maintained by "masters of complexity": network admins who have accumulated the wisdom to troubleshoot complex problems despite a limited toolset. This position paper advocates a more structured troubleshooting approach that leverages architectural layering in Software-Defined Networks (SDNs). In all networks, high-level intent (policy) must correctly map to low-level forwarding behavior (hardware configuration). In SDNs, intent is explicitly expressed, forwarding semantics are explicitly defined, and each architectural layer fully specifies the behavior of the network. Building on these observations, we show how recently developed troubleshooting tools fit into a coherent workflow that detects mistranslations between layers to precisely localize sources of errant control logic. Our goals are to explain the overall picture, show how the pieces fit together to enable a systematic workflow, and highlight the questions that remain. Once this workflow is realized, network admins can formally verify that their network is operating correctly, automatically troubleshoot bugs, and systematically track down their root causes. This frees admins to fix problems rather than diagnose symptoms.
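The idea of detecting a mistranslation between adjacent layers can be illustrated with a minimal sketch. It is not taken from the paper: the layer names (`policy`, `controller_view`, `switch_tables`), the representation of each layer as a mapping from a (source, destination) flow to `"allow"` or `"deny"`, and the function `first_mistranslation` are all assumptions made for illustration. The sketch walks the layers top-down and reports the first adjacent pair whose specified behavior disagrees, which is where the errant translation would be localized.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: each layer fully specifies network behavior
# as a mapping from a (source, destination) flow to "allow" or "deny".
Flow = Tuple[str, str]
Behavior = Dict[Flow, str]


def first_mistranslation(
    layers: List[Tuple[str, Behavior]],
) -> Optional[Tuple[str, str, List[Flow]]]:
    """Return the first adjacent layer pair that disagrees, top-down.

    The result is (upper layer name, lower layer name, differing flows),
    or None if every layer specifies the same behavior.
    """
    for (upper_name, upper), (lower_name, lower) in zip(layers, layers[1:]):
        # A flow differs if its action changes or it exists in only one layer.
        differing = sorted(
            flow
            for flow in upper.keys() | lower.keys()
            if upper.get(flow) != lower.get(flow)
        )
        if differing:
            return upper_name, lower_name, differing
    return None


# Illustrative data: the policy denies h1 -> h3, but the switch tables allow it.
layers = [
    ("policy", {("h1", "h2"): "allow", ("h1", "h3"): "deny"}),
    ("controller_view", {("h1", "h2"): "allow", ("h1", "h3"): "deny"}),
    ("switch_tables", {("h1", "h2"): "allow", ("h1", "h3"): "allow"}),
]

result = first_mistranslation(layers)
print(result)
```

In this toy example the top two layers agree, so the discrepancy is attributed to the translation from the controller's view to the switch tables. Real tools compare far richer state than a flat allow/deny table.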