Future Generation Computer Systems
The service market is currently experiencing continuous growth, as services allow new and existing applications to be enhanced quickly and easily. However, hosting services according to the common on-premise model cannot cope with erratic, spike-prone service loads. A more promising approach is hosting services in the cloud (utility computing), which enables dynamic resource allocation and thereby makes it possible to meet average response time requirements even under long-term load fluctuations. Unfortunately, in the presence of short-term fluctuations, resource utilization has to stay below 50% for the response time to remain of the same order as the job sizes. In this work we propose to compensate for the underutilization caused by hosting low-latency services by allocating the remaining resources to time-insensitive service requests. Our solution combines load balancing with admission control and scheduling of application server threads. The proposed approach is evaluated by means of experiments with a prototype deployed on Amazon EC2. The experimental results show that server utilization can be increased without penalizing low-latency requests.
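The claim that utilization must stay below roughly 50% can be illustrated with elementary queueing arithmetic. The sketch below assumes an M/M/1 model (an assumption for illustration; the paper may use a different model): the mean response time is T = S / (1 − ρ), where S is the mean job size (service time) and ρ the server utilization, so at ρ = 0.5 the response time is already 2S and grows sharply beyond that.

```python
# Utilization/latency trade-off under an assumed M/M/1 model.
# T = S / (1 - rho): mean response time as a function of utilization.

def mm1_response_time(service_time: float, utilization: float) -> float:
    """Mean response time of an M/M/1 queue with mean service time
    `service_time` and server utilization `utilization` in [0, 1)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)

S = 0.05  # mean job size: 50 ms (illustrative value, not from the paper)
for rho in (0.3, 0.5, 0.7, 0.9):
    print(f"rho={rho:.1f}  T={mm1_response_time(S, rho) * 1000:.0f} ms")

# At rho = 0.5 the mean response time equals 2*S, i.e. the same order
# as the job size; at rho = 0.9 it is 10*S. This is why serving only
# low-latency requests leaves roughly half the capacity idle -- capacity
# the proposed approach fills with time-insensitive requests.
```

This simple model motivates the paper's design: the low-latency class is kept below the 50% operating point, while admission control feeds batch-style requests into the remaining headroom.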