Replica-aware caching for Web proxies

Authors:
Hyokyung Bahn;Hyunsook Lee;Sam H. Noh;Sang Lyul Min;Kern Koh
Affiliations:
School of Computer Science and Engineering, Seoul National University, Seoul 151-742, South Korea;Adelinux Inc., 144-1, Samsung-dong, Kangnam-ku, Seoul 135-745, South Korea;School of Information and Computer Engineering, Hong-Ik University, Seoul 121-791, South Korea;School of Computer Science and Engineering, Seoul National University, Seoul 151-742, South Korea;School of Computer Science and Engineering, Seoul National University, Seoul 151-742, South Korea
Venue:
Computer Communications
Year:
2002

Abstract

A significant percentage of Web objects are replicas. For example, a vast majority of image files such as banners, buttons, and logos are duplicated throughout the WWW. Nevertheless, Web caching systems generally treat replicas as different objects because they have different URLs. In this paper, we propose a simple and efficient way to manage replicated objects in Web proxy caches. In the proposed scheme, the MD5 checksum of an object, together with its size, forms an identifier for the object, so that objects with identical content are recognized as replicas even when their URLs differ. Experimental results show that the proposed scheme significantly improves the cache hit rate and the byte hit rate by removing redundant objects from the cache and by reflecting the popularity of objects more precisely.
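
The identifier described in the abstract can be illustrated with a minimal sketch. The class name `ReplicaAwareCache`, the helper `content_id`, the URL-to-identifier map, and the LRU eviction order are all assumptions made for illustration; the abstract does not specify the paper's data structures or replacement policy. The sketch only shows how keying the cache on (MD5 checksum, size) lets several URLs share one stored copy and one popularity counter.

```python
import hashlib
from collections import OrderedDict
from typing import Dict, Optional, Tuple

ContentId = Tuple[str, int]


def content_id(body: bytes) -> ContentId:
    """Identifier formed from the MD5 checksum and the object size."""
    return hashlib.md5(body).hexdigest(), len(body)


class ReplicaAwareCache:
    """Hypothetical cache keyed by content identifier instead of URL."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        # Many URLs may map to the same content identifier.
        self.url_to_id: Dict[str, ContentId] = {}
        # One stored copy per identifier; order tracks recency (LRU is an assumption).
        self.objects: "OrderedDict[ContentId, bytes]" = OrderedDict()
        # Hits are counted per identifier, so replicas pool their popularity.
        self.hits: Dict[ContentId, int] = {}

    def get(self, url: str) -> Optional[bytes]:
        cid = self.url_to_id.get(url)
        if cid is None or cid not in self.objects:
            return None  # unknown URL, or its content was evicted
        self.objects.move_to_end(cid)
        self.hits[cid] = self.hits.get(cid, 0) + 1
        return self.objects[cid]

    def put(self, url: str, body: bytes) -> None:
        cid = content_id(body)
        self.url_to_id[url] = cid
        if cid in self.objects:
            # Replica of an object already cached: store no second copy.
            self.objects.move_to_end(cid)
            return
        if len(body) > self.capacity_bytes:
            return  # object cannot fit at all
        while self.objects and self.used_bytes + len(body) > self.capacity_bytes:
            _, evicted = self.objects.popitem(last=False)
            self.used_bytes -= len(evicted)
        self.objects[cid] = body
        self.used_bytes += len(body)
```

In this sketch, the first request for a new URL is still fetched from the origin server, since the checksum can only be computed once the body is available; the saving comes from storing a single copy and from counting all requests for the same content together.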