Leveraging SDN layering to systematically troubleshoot networks

  • Authors:
  • Brandon Heller;Colin Scott;Nick McKeown;Scott Shenker;Andreas Wundsam;Hongyi Zeng;Sam Whitlock;Vimalkumar Jeyakumar;Nikhil Handigol;James McCauley;Kyriakos Zarifis;Peyman Kazemian

  • Affiliations:
  • Stanford University, Stanford, CA, USA;UC Berkeley, Berkeley, CA, USA;Stanford University, Stanford, CA, USA;International Computer Science Institute & UC Berkeley, Berkeley, CA, USA;Big Switch Networks & International Computer Science Institute, Mountain View, Berkeley, CA, USA;Stanford University, Stanford, CA, USA;International Computer Science Institute, Berkeley, CA, USA;Stanford University, Stanford, CA, USA;Stanford University, Stanford, CA, USA;UC Berkeley, Berkeley, CA, USA;University of Southern California, Los Angeles, CA, USA;Stanford University, Stanford, CA, USA

  • Venue:
  • Proceedings of the second ACM SIGCOMM workshop on Hot topics in software defined networking
  • Year:
  • 2013

Abstract

Today's networks are maintained by "masters of complexity": network admins who have accumulated the wisdom to troubleshoot complex problems, despite a limited toolset. This position paper advocates a more structured troubleshooting approach that leverages architectural layering in Software-Defined Networks (SDNs). In all networks, high-level intent (policy) must correctly map to low-level forwarding behavior (hardware configuration). In SDNs, intent is explicitly expressed, forwarding semantics are explicitly defined, and each architectural layer fully specifies the behavior of the network. Building on these observations, we show how recently developed troubleshooting tools fit into a coherent workflow that detects mistranslations between layers to precisely localize sources of errant control logic. Our goals are to explain the overall picture, show how the pieces fit together to enable a systematic workflow, and highlight the questions that remain. Once this workflow is realized, network admins can formally verify that their network is operating correctly, automatically troubleshoot bugs, and systematically track down their root cause -- freeing admins to fix problems, rather than diagnose their symptoms.