Identifying performance bottlenecks in CDNs through TCP-level monitoring

  • Authors:
  • Peng Sun;Minlan Yu;Michael J. Freedman;Jennifer Rexford

  • Affiliations:
  • Princeton University, Princeton, NJ, USA;Princeton University, Princeton, NJ, USA;Princeton University, Princeton, NJ, USA;Princeton University, Princeton, NJ, USA

  • Venue:
  • Proceedings of the first ACM SIGCOMM workshop on Measurements up the stack
  • Year:
  • 2011

Quantified Score

Hi-index 0.02

Visualization

Abstract

Content distribution networks (CDNs) need to make decisions, such as server selection and routing, to improve performance for their clients. The performance may be limited by various factors such as packet loss in the network, a small receive buffer at the client, or constrained server CPU and disk resources. Conventional measurement techniques are not effective for distinguishing these performance problems: application-layer logs are too coarse-grained, while network-level traces are too expensive to collect all the time. We argue that passively monitoring the transport-level statistics in the server's network stack is a better approach. This paper presents a tool for monitoring and analyzing TCP statistics, and an analysis of a CoralCDN node in PlanetLab for six weeks. Our analysis shows that more than 10% of connections are server-limited at least 40% of the time, and many connections are limited by the congestion window despite no packet loss. Still, we see that clients in 377 Autonomous Systems (ASes) experience persistent packet loss. By separating network congestion from other performance problems, our analysis provides a much more accurate view of the performance of the network paths than what is possible with server logs alone.