The heavy-tailed nature of Internet flow sizes, web pages, and computer files can cause non-preemptive scheduling policies to incur large average response times. Since there are numerous communication and distributed-processing systems where preempting jobs is quite expensive, reducing response times under this constraint is a pressing issue. One proposal for coping with non-preemption is to use multiple servers: classify jobs by size and assign a server to each class. Unfortunately, in most systems of interest, job sizes are unknown. An alternative is to queue all jobs together in a central queue and assign them in FCFS fashion to the next available server, but this has been believed to yield large response times. In this paper, we argue that this is not the case, so long as there are enough servers. The question then is: what is the right number of servers, and is it small enough to be practical? Despite the large body of prior work analyzing central-queue systems, no existing models are accurate for heavy-tailed size distributions. Our main contribution is a simple yet accurate model for a central queue with multiple servers. The model accurately predicts the right number of servers, as well as the mean and variance of the system's response time. Hence, it can be used to improve the performance of real systems such as multi-server supercomputing centers and multi-channel communication systems.
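To make the setting concrete, the central-queue system described above can be sketched as a small event-driven simulation. This is only an illustration, not the paper's model: it assumes Poisson arrivals, Pareto-distributed job sizes (a common heavy-tailed stand-in), and a fixed total service capacity split evenly among the `k` servers; all parameter values are hypothetical.

```python
import heapq
import random

def simulate_central_queue(k, total_capacity=1.0, lam=0.2,
                           n_jobs=20_000, alpha=1.5, seed=1):
    """Mean response time of an FCFS central queue feeding k servers.

    Each server runs at speed total_capacity / k, so total capacity is
    held fixed as k varies. Job sizes are Pareto(alpha) with minimum 1,
    i.e. heavy-tailed with mean alpha / (alpha - 1).
    """
    rng = random.Random(seed)
    speed = total_capacity / k
    free = [0.0] * k            # min-heap of server free times
    t = 0.0                     # current arrival time
    total_resp = 0.0
    for _ in range(n_jobs):
        t += rng.expovariate(lam)        # Poisson arrivals, rate lam
        size = rng.paretovariate(alpha)  # heavy-tailed job size >= 1
        # FCFS: the job starts when the earliest server frees up.
        start = max(t, free[0])
        finish = start + size / speed
        heapq.heapreplace(free, finish)  # that server is busy until finish
        total_resp += finish - t         # response = wait + service
    return total_resp / n_jobs
```

Sweeping `k` with this sketch exhibits the trade-off the paper studies: more servers let small jobs bypass the occasional enormous one, at the cost of slowing every individual service down by a factor of `k`, so the mean response time is minimized at some intermediate number of servers.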