Design and evaluation of web proxies by leveraging self-similarity of web traffic

  • Authors:
  • Rachid El Abdouni Khayari

  • Affiliations:
University of the Armed Forces Munich, Department of Computer Science, Neubiberg, Germany

  • Venue:
  • Computer Networks: The International Journal of Computer and Telecommunications Networking - Special issue: Network modelling and simulation
  • Year:
  • 2006


Abstract

In this paper a new concept for the analysis of communication systems and their performance is presented. Our view is that insight into the system workload can help in developing new methods for improving the perceived system performance. From measurements, we derived the typical request patterns; these insights were then used to develop new methods for improving system performance, and the new approaches were validated by simulation.

First, we present a fitting algorithm that works directly on measurement data instead of on an intermediate heavy-tailed distribution. This method provides good approximations of the object-size distribution as well as of the performance measures in an M|G|1 queue. The results of the fitting procedure also yield a classification of the considered events: they partition the space of object sizes into distinct classes.

Furthermore, we develop a new caching algorithm, class-based least recently used (C-LRU), which aims at a good balance between small and large documents in the cache. Similarly, the new scheduling algorithm, class-based interleaving weighted fair queueing (CI-WFQ), exploits the distribution of the requested object sizes to set its parameters such that good mean response times are obtained and starvation does not occur. We have found both methods suitable for use in Web proxy servers; in many cases they improve on existing strategies. The methods were compared using trace-driven simulations. Both algorithms are parameterized by information on the requested object-size distribution and can therefore be seen as potentially adaptive to the considered workload.
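The class-based idea behind C-LRU can be illustrated with a minimal sketch. This is not the paper's implementation; it only assumes the abstract's description: the cache is partitioned by object-size class (with boundaries taken from the fitted size distribution), each class gets a capacity share, and LRU eviction is applied within a class, so large objects cannot displace all small ones. The class names, capacity shares, and the `request` interface below are illustrative choices, not taken from the paper.

```python
from collections import OrderedDict

class CLRUCache:
    """Sketch of class-based LRU (C-LRU): the cache is split into
    size classes, each with its own capacity share and LRU order."""

    def __init__(self, capacity, class_bounds, shares):
        # class_bounds: ascending upper size bound per class;
        # shares: fraction of total capacity per class (sum to 1.0).
        assert len(class_bounds) == len(shares)
        self.bounds = class_bounds
        self.caps = [capacity * s for s in shares]
        self.used = [0.0] * len(shares)
        self.lists = [OrderedDict() for _ in shares]  # key -> size, in LRU order

    def _class_of(self, size):
        # Smallest class whose bound covers the object size.
        for i, bound in enumerate(self.bounds):
            if size <= bound:
                return i
        return len(self.bounds) - 1

    def request(self, key, size):
        """Serve a request; return True on a cache hit, False on a miss.
        On a miss the object is inserted, evicting within its own class."""
        c = self._class_of(size)
        if key in self.lists[c]:
            self.lists[c].move_to_end(key)  # mark as most recently used
            return True
        if size > self.caps[c]:
            return False  # larger than its class partition: bypass the cache
        while self.used[c] + size > self.caps[c]:
            _, evicted_size = self.lists[c].popitem(last=False)  # class-local LRU
            self.used[c] -= evicted_size
        self.lists[c][key] = size
        self.used[c] += size
        return False
```

For example, with a 20-unit cache split evenly between objects of size at most 4 and everything larger, caching one 8-unit object never evicts any of the small objects, which is exactly the balance the class partitioning is meant to enforce.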