Dynamic capacity provisioning is a well-studied approach to handling gradual changes in data center load. Abrupt load spikes remain problematic, however, because the work in the system rises very quickly during the setup time needed to turn on additional capacity. Performance can be severely affected even if it takes only 5 seconds to bring additional capacity online. In this paper, we propose SOFTScale, an approach to handling load spikes in multi-tier data centers without having to over-provision resources. SOFTScale works by opportunistically stealing resources from other tiers to alleviate the bottleneck tier, even when the tiers are carefully provisioned at capacity. SOFTScale is especially useful during the transient overload periods when additional capacity is being brought online. Using an implementation on a 28-server multi-tier testbed, we investigate a range of possible load spikes, including an artificial doubling or tripling of load, as well as large spikes in real traces. We find that SOFTScale can meet our stringent Service Level Agreement goal of a 95th percentile response time of 500 ms without using any additional resources, even under some extreme load spikes that would normally cause the system (without SOFTScale) to exhibit response times as high as 96 seconds.
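As a rough illustration of why even a 5-second setup time can hurt, the sketch below uses a simple fluid approximation of the backlog that builds up while extra capacity is being brought online. This is not the model used in the paper; the function names and all rates (1000, 2000, and 2200 requests per second) are hypothetical values chosen only for the arithmetic.

```python
def backlog_after_setup(arrival_rate, service_capacity, setup_time_s):
    """Requests queued by the time new capacity comes online.

    Fluid approximation (an assumption, not the paper's model): work
    accumulates at the rate by which arrivals exceed capacity.
    """
    excess = max(0.0, arrival_rate - service_capacity)
    return excess * setup_time_s


def drain_time(backlog, arrival_rate, new_capacity):
    """Seconds needed to clear the backlog once the new capacity is online."""
    spare = new_capacity - arrival_rate
    if spare <= 0:
        return float("inf")  # the system never catches up
    return backlog / spare


# Hypothetical numbers: a tier provisioned for 1000 req/s sees load double,
# and additional capacity takes 5 s to come online.
queued = backlog_after_setup(arrival_rate=2000, service_capacity=1000, setup_time_s=5)
recovery = drain_time(queued, arrival_rate=2000, new_capacity=2200)
print(queued, recovery)
```

Under these assumed numbers, 5000 requests are queued after 5 seconds, and clearing them with only 200 req/s of spare capacity takes a further 25 seconds, so the response-time impact lasts well beyond the setup period itself.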