QACO: exploiting partial execution in web servers

Authors:
Jinhan Kim;Sameh Elnikety;Yuxiong He;Seung-won Hwang;Shaolei Ren
Affiliations:
Pohang University of Science and Technology;Microsoft Research;Microsoft Research;Pohang University of Science and Technology;Florida International University
Venue:
Proceedings of the 2013 ACM Cloud and Autonomic Computing Conference
Year:
2013

Abstract

Web servers must deliver content with high response quality and a short response time. Meeting both requirements is challenging, especially during load spikes. We observe, however, that a response to a request can be adapted or partially executed depending on the resources currently available at the server. For example, under resource contention a web server can send a low- or medium-resolution image instead of the original high-resolution image. In this paper, we exploit partial execution to expose a trade-off between resource consumption and service quality, and we show how to manage server resources to improve service quality and responsiveness. Specifically, we develop a framework called Quota-based Control Optimization (QACO). The quota represents the total amount of resources available for all pending requests. QACO consists of two modules: (1) a control module that adjusts the quota to meet the response time target, and (2) an optimization module that exploits partial execution and allocates the quota to pending requests in a manner that improves total response quality. We evaluate the framework using a system implementation in the Apache web server and a simulation study of a Video-on-Demand server. The results show that, under a response time target, QACO achieves higher response quality than traditional techniques that admit or reject requests without exploiting partial execution.
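The two-module structure described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the controller or the optimizer, so everything here is an assumption made for illustration: the proportional quota update in `adjust_quota`, the greedy marginal-gain allocation in `allocate_quota`, the `Request` type, and the per-request `(cost, quality)` levels are hypothetical and are not the authors' implementation.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Request:
    # Hypothetical request model: each level is (cost, quality), with
    # strictly increasing cost and increasing quality, e.g. low, medium,
    # and high image resolution.
    name: str
    levels: list


def adjust_quota(quota, measured_rt, target_rt, gain=0.5, min_quota=1.0):
    """Control module (assumed form): shrink the quota when the measured
    response time exceeds the target, grow it when there is slack."""
    error = (target_rt - measured_rt) / target_rt
    return max(min_quota, quota * (1.0 + gain * error))


def allocate_quota(requests, quota):
    """Optimization module (assumed heuristic): repeatedly grant the
    upgrade with the best quality gain per unit of extra cost.
    Returns the chosen level per request (-1 means not served) and the
    unused quota. Greedy is a heuristic, not a guaranteed optimum."""
    chosen = {r.name: -1 for r in requests}
    remaining = quota
    heap = []

    def push_upgrade(i, lvl):
        levels = requests[i].levels
        if lvl >= len(levels):
            return
        prev_cost, prev_q = levels[lvl - 1] if lvl > 0 else (0.0, 0.0)
        cost, q = levels[lvl]
        ratio = (q - prev_q) / (cost - prev_cost)
        heapq.heappush(heap, (-ratio, i, lvl))  # max-heap via negation

    for i in range(len(requests)):
        push_upgrade(i, 0)

    while heap:
        _, i, lvl = heapq.heappop(heap)
        levels = requests[i].levels
        prev_cost = levels[lvl - 1][0] if lvl > 0 else 0.0
        extra = levels[lvl][0] - prev_cost
        if extra <= remaining:
            remaining -= extra
            chosen[requests[i].name] = lvl
            push_upgrade(i, lvl + 1)
        # Otherwise the upgrade does not fit; the request keeps its level.
    return chosen, remaining


if __name__ == "__main__":
    image_levels = [(1.0, 0.5), (2.0, 0.8), (4.0, 1.0)]  # low, medium, high
    pending = [Request(n, image_levels) for n in ("a", "b", "c")]
    quota = adjust_quota(quota=12.0, measured_rt=2.0, target_rt=1.0)
    print(quota, allocate_quota(pending, quota))
```

In this toy setting the controller halves the quota from 12 to 6 because the measured response time is twice the target. With a quota of 6, the allocator serves all three requests at the medium level for a total quality of 2.4, whereas an admit-or-reject policy that always serves the full-cost version (cost 4) could admit only one request for a total quality of 1.0.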