The origins of network server latency & the myth of connection scheduling

Authors:
Yaoping Ruan;Vivek S. Pai
Affiliations:
Princeton University, Princeton, NJ;Princeton University, Princeton, NJ
Venue:
Proceedings of the joint international conference on Measurement and modeling of computer systems
Year:
2004

Citing 5
Cited 2

SEDA: an architecture for well-conditioned, scalable internet services
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles

Using Cohort-Scheduling to Enhance Server Performance
ATEC '02 Proceedings of the General Track of the annual conference on USENIX Annual Technical Conference

Size-based scheduling to improve web performance
ACM Transactions on Computer Systems (TOCS)

Connection scheduling in web servers
USITS'99 Proceedings of the 2nd conference on USENIX Symposium on Internet Technologies and Systems - Volume 2

Flash: an efficient and portable web server
ATEC '99 Proceedings of the annual conference on USENIX Annual Technical Conference

Application controlled caching for web servers
Enterprise Information Systems

Chronos: predictable low latency for data center applications
Proceedings of the Third ACM Symposium on Cloud Computing

Abstract

We investigate the origins of server-induced latency to understand how to improve latency optimization techniques. Using the Flash Web server [4], we analyze latency behavior under various loads. Despite latency profiles that suggest standard queuing delays, we find that most latency actually originates from negative interactions between the application and the locking and blocking mechanisms in the kernel. Modifying the server and kernel to avoid these problems yields both qualitative and quantitative changes in the latency profiles -- latency drops by more than an order of magnitude, and the effective service discipline also improves. We find our modifications also mitigate service burstiness in the application, reducing the event queue lengths dramatically and eliminating any benefit from application-level connection scheduling. We identify one remaining source of unfairness, related to competition in the networking stack. We show that adjusting the TCP congestion window size addresses this problem, reducing latency by an additional factor of three.
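
The blocking effect described in the abstract can be illustrated with a toy model. The sketch below is not the authors' methodology or the Flash server's code. It is a minimal, deterministic simulation of a single-threaded event loop, with all names and parameters (`simulate`, `block_every`, `block_time`, the 0.5 ms service time) chosen here purely as assumptions. It compares a loop that stalls whenever one request blocks against a loop in which the blocking work is handed to a hypothetical helper, so that only the affected request waits.

```python
from statistics import mean


def simulate(arrivals, service, block_every, block_time, blocking):
    """Return per-request latencies for a toy single-threaded event loop.

    arrivals    -- sorted arrival times (ms), one per request
    service     -- CPU time (ms) the loop spends on each request
    block_every -- every block_every-th request incurs a blocking operation
    block_time  -- duration (ms) of that blocking operation
    blocking    -- True: the whole loop stalls; False: a hypothetical helper
                   absorbs the wait, so only that one request is delayed
    """
    clock = 0.0
    latencies = []
    for i, arrival in enumerate(arrivals):
        clock = max(clock, arrival)   # the loop idles until work arrives
        clock += service              # the loop processes the request
        finish = clock
        if i % block_every == 0:
            if blocking:
                clock += block_time   # every queued request now waits too
                finish = clock
            else:
                finish = clock + block_time  # only this request waits
        latencies.append(finish - arrival)
    return latencies


# Assumed workload: one request per ms, every 10th one blocks for 4 ms.
arrivals = [float(i) for i in range(100)]
stalled = simulate(arrivals, 0.5, 10, 4.0, blocking=True)
offloaded = simulate(arrivals, 0.5, 10, 4.0, blocking=False)

print("mean latency, loop stalls:   ", mean(stalled))
print("mean latency, work offloaded:", mean(offloaded))
```

In this model the worst-case latency is the same in both runs, but when the loop stalls the delay spreads to the requests queued behind the blocked one, which produces a latency profile that looks like ordinary queuing even though its cause is a single blocking operation. The model ignores contention at the helper and says nothing about kernel locking or TCP behavior.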