DeTail: reducing the flow completion time tail in datacenter networks

Authors:
David Zats;Tathagata Das;Prashanth Mohan;Dhruba Borthakur;Randy Katz
Affiliations:
University of California, Berkeley, Berkeley, USA;University of California, Berkeley, Berkeley, USA;University of California, Berkeley, Berkeley, USA;Facebook, Menlo Park, USA;University of California, Berkeley, Berkeley, USA
Venue:
Proceedings of the ACM SIGCOMM 2012 conference on Applications, technologies, architectures, and protocols for computer communication
Year:
2012

Abstract

Web applications have become so sophisticated that rendering a typical page may require hundreds of intra-datacenter flows. At the same time, web sites must meet strict page creation deadlines of 200-300 ms to satisfy user demands for interactivity. Long-tailed flow completion times make it challenging for web sites to meet these constraints. Sites are forced to choose between rendering only a subset of the complex page, which sacrifices quality, and delaying its rendering past the deadline, which sacrifices responsiveness. Either option leads to potential financial loss. In this paper, we present a new cross-layer network stack aimed at reducing the long tail of flow completion times. The approach exploits cross-layer information to reduce packet drops, prioritize latency-sensitive flows, and evenly distribute network load. We evaluate our approach through NS-3-based simulation and a Click-based implementation, demonstrating that it consistently reduces the tail across a wide range of workloads. We often achieve reductions of over 50% in 99.9th percentile flow completion times.
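
A rough way to see why the tail matters when a page needs hundreds of flows is sketched below. It is an illustration under stated assumptions, not the paper's analysis: the function name `prob_any_flow_late` and the flow counts are hypothetical, and flows are assumed to complete independently of one another, which real datacenter traffic need not satisfy. Under that assumption, a page that waits on 200 flows has roughly an 18% chance that at least one flow lands beyond the 99.9th percentile completion time.

```python
def prob_any_flow_late(num_flows: int, percentile: float) -> float:
    """Chance that at least one of `num_flows` flows exceeds the given
    percentile of flow completion time.

    Assumption (ours, not the paper's): flows complete independently.
    """
    p_on_time = percentile / 100.0          # e.g. 99.9 -> 0.999
    return 1.0 - p_on_time ** num_flows     # complement of "all on time"


if __name__ == "__main__":
    # Hypothetical page sizes; the abstract says only "hundreds" of flows.
    for flows in (1, 100, 200, 500):
        risk = prob_any_flow_late(flows, 99.9)
        print(f"{flows:4d} flows: {risk:.1%} chance of hitting the 99.9th percentile tail")
```

The point of the sketch is only that a rare per-flow event becomes common per page as the number of flows grows, which is why the abstract reports results at the 99.9th percentile rather than at the median.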