A cluster-based server consists of a front-end dispatcher and several back-end servers. The dispatcher receives incoming requests and assigns them to back-end servers for processing. Our goal is to devise an assignment policy that achieves good response time performance and is practical to implement, in that the amount of information used by the dispatcher is relatively small, so that the attendant computation and communication overheads are low. In contrast to extant policies that apply the same assignment rule to all incoming jobs, our approach calls for the dispatcher to classify incoming jobs as long or short, and then use class-dependent assignment policies. Specifically, we propose a policy, called CDA (Class Dependent Assignment), in which short jobs are assigned in Round-Robin manner as soon as they arrive, while long jobs are deferred and assigned only when a back-end server becomes idle. Furthermore, while processing a long job, a back-end server is not assigned any other jobs. Our approach is motivated by empirical evidence suggesting that the sizes of files traveling on the Internet follow power-law distributions, where long jobs constituting a small fraction of all incoming jobs actually account for a large fraction of the overall load. To gauge the performance of the proposed policy, we exercised it on empirical data traces measured at Internet sites serving the 1998 World Cup. Since the assignment of long jobs incurs computational overhead as well as extra communication overhead, we studied the performance of CDA as a function of the fraction of jobs classified as long. Our study shows that classifying even a small fraction of jobs as long can have a profound impact on overall response time performance. More specifically, our experimental results show that if less than 3% of the jobs are classified as long, then CDA outperforms traditional policies, such as Round-Robin, by two orders of magnitude.
From an implementation viewpoint, these results support our contention that CDA-based assignment is a practical policy combining low overhead and greatly improved performance.
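The dispatching logic described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name, the size threshold used to classify jobs as long or short, and the idle-notification interface are all assumptions introduced here for clarity.

```python
from collections import deque


class CDADispatcher:
    """Sketch of the Class Dependent Assignment (CDA) policy: short jobs
    are assigned Round-Robin on arrival; long jobs wait for an idle
    back-end server, which then serves the long job exclusively.
    (Hypothetical interface; the size threshold is an assumed parameter.)"""

    def __init__(self, num_servers, size_threshold):
        self.num_servers = num_servers
        self.size_threshold = size_threshold  # jobs larger than this are "long"
        self.rr_index = 0                     # Round-Robin pointer for short jobs
        self.long_queue = deque()             # long jobs deferred by the dispatcher
        self.busy_with_long = set()           # servers dedicated to a long job

    def assign(self, job_size):
        """Return a server index for a short job, or None if the job is
        long (deferred) or every server is dedicated to a long job."""
        if job_size <= self.size_threshold:
            # Short job: Round-Robin over servers not dedicated to a long job.
            for _ in range(self.num_servers):
                server = self.rr_index
                self.rr_index = (self.rr_index + 1) % self.num_servers
                if server not in self.busy_with_long:
                    return server
            return None
        # Long job: defer until some back-end server reports idle.
        self.long_queue.append(job_size)
        return None

    def server_idle(self, server):
        """Called when a server becomes idle: hand it the oldest waiting
        long job and dedicate it; otherwise release any dedication."""
        if self.long_queue:
            job = self.long_queue.popleft()
            self.busy_with_long.add(server)
            return job
        self.busy_with_long.discard(server)
        return None
```

Note that dedicating a server to a long job is what shields short jobs from the heavy tail: under a power-law size distribution, the few long jobs carry much of the load, so isolating them keeps the Round-Robin servers' queues short.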