Size-based scheduling to improve web performance

Authors:
Mor Harchol-Balter; Bianca Schroeder; Nikhil Bansal; Mukesh Agrawal
Affiliations:
Carnegie Mellon University, Pittsburgh, PA; Carnegie Mellon University, Pittsburgh, PA; Carnegie Mellon University, Pittsburgh, PA; Carnegie Mellon University, Pittsburgh, PA
Venue:
ACM Transactions on Computer Systems (TOCS)
Year:
2003

Abstract

Is it possible to reduce the expected response time of every request at a web server, simply by changing the order in which we schedule the requests? That is the question we ask in this paper.

This paper proposes a method for improving the performance of web servers servicing static HTTP requests. The idea is to give preference to requests for small files or requests with short remaining file size, in accordance with the SRPT (Shortest Remaining Processing Time) scheduling policy.

The implementation is at the kernel level and involves controlling the order in which socket buffers are drained into the network. Experiments are executed both in a LAN and a WAN environment. We use the Linux operating system and the Apache and Flash web servers.

Results indicate that SRPT-based scheduling of connections yields significant reductions in delay at the web server. These result in a substantial reduction in mean response time and mean slowdown for both the LAN and WAN environments. Significantly, and counter to intuition, the requests for large files are only negligibly penalized or not at all penalized as a result of SRPT-based scheduling.
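
The SRPT idea described in the abstract can be illustrated with a small simulation. The sketch below is not the kernel-level implementation from the paper; it is a toy single-server model in abstract time units. The baseline of equal sharing among all active requests (processor sharing) is an assumption made here for comparison, and the function names and the three-request workload are hypothetical. Slowdown is computed as response time divided by request size.

```python
import heapq


def srpt_response_times(jobs):
    """Preemptive SRPT on one server. jobs: list of (arrival, size)."""
    order = sorted(range(len(jobs)), key=lambda i: jobs[i][0])
    response = [0.0] * len(jobs)
    heap = []  # entries are (remaining_size, job_index)
    now, k = 0.0, 0
    while k < len(order) or heap:
        if not heap:  # server idle: jump to the next arrival
            now = max(now, jobs[order[k]][0])
        while k < len(order) and jobs[order[k]][0] <= now:
            heapq.heappush(heap, (float(jobs[order[k]][1]), order[k]))
            k += 1
        remaining, i = heapq.heappop(heap)  # shortest remaining size
        next_arrival = jobs[order[k]][0] if k < len(order) else float("inf")
        run = min(remaining, next_arrival - now)
        now += run
        if run < remaining:  # preempted by an arrival: requeue the rest
            heapq.heappush(heap, (remaining - run, i))
        else:
            response[i] = now - jobs[i][0]
    return response


def ps_response_times(jobs):
    """Equal sharing (processor sharing) baseline; an assumed stand-in
    for a server that treats all active requests alike."""
    order = sorted(range(len(jobs)), key=lambda i: jobs[i][0])
    response = [0.0] * len(jobs)
    active = {}  # job_index -> remaining size
    now, k = 0.0, 0
    while k < len(order) or active:
        if not active:
            now = max(now, jobs[order[k]][0])
        while k < len(order) and jobs[order[k]][0] <= now:
            active[order[k]] = float(jobs[order[k]][1])
            k += 1
        next_arrival = jobs[order[k]][0] if k < len(order) else float("inf")
        share = len(active)  # each active job is served at rate 1/share
        shortest = min(active, key=active.get)
        time_to_finish = active[shortest] * share
        finishes = time_to_finish <= next_arrival - now
        dt = time_to_finish if finishes else next_arrival - now
        for i in active:
            active[i] -= dt / share
        now += dt
        if finishes:
            del active[shortest]
            response[shortest] = now - jobs[shortest][0]
    return response


def mean(values):
    return sum(values) / len(values)


def mean_slowdown(jobs, response):
    """Slowdown of a request = response time / size."""
    return mean([r / size for (_, size), r in zip(jobs, response)])


# Hypothetical workload: one large request, then two small ones.
jobs = [(0, 10), (1, 1), (2, 1)]
for name, policy in (("SRPT", srpt_response_times), ("PS", ps_response_times)):
    resp = policy(jobs)
    print(name, "mean response:", round(mean(resp), 3),
          "mean slowdown:", round(mean_slowdown(jobs, resp), 3))
```

In this toy workload the two small requests finish sooner under SRPT, while the large request completes at the same time under both policies, which is consistent in spirit with the claim that large requests are penalized little or not at all. A three-request example proves nothing about real workloads; the results in the paper come from LAN and WAN experiments.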