SEDA: an architecture for well-conditioned, scalable internet services
SOSP '01 Proceedings of the eighteenth ACM symposium on Operating systems principles
Using Cohort-Scheduling to Enhance Server Performance
ATEC '02 Proceedings of the General Track of the annual conference on USENIX Annual Technical Conference
Size-based scheduling to improve web performance
ACM Transactions on Computer Systems (TOCS)
Connection scheduling in web servers
USITS'99 Proceedings of the 2nd conference on USENIX Symposium on Internet Technologies and Systems - Volume 2
Flash: an efficient and portable web server
ATEC '99 Proceedings of the annual conference on USENIX Annual Technical Conference
Application controlled caching for web servers
Enterprise Information Systems
Chronos: predictable low latency for data center applications
Proceedings of the Third ACM Symposium on Cloud Computing
We investigate the origins of server-induced latency to understand how to improve latency optimization techniques. Using the Flash Web server [4], we analyze latency behavior under various loads. Despite latency profiles that suggest standard queuing delays, we find that most latency actually originates from negative interactions between the application and the locking and blocking mechanisms in the kernel. Modifying the server and kernel to avoid these problems yields both qualitative and quantitative changes in the latency profiles: latency drops by more than an order of magnitude, and the effective service discipline also improves. We find that our modifications also mitigate service burstiness in the application, reducing the event queue lengths dramatically and eliminating any benefit from application-level connection scheduling. We identify one remaining source of unfairness, related to competition in the networking stack. We show that adjusting the TCP congestion window size addresses this problem, reducing latency by an additional factor of three.