Demand-based document dissemination to reduce traffic and balance load in distributed information systems

  • Authors:
  • A. Bestavros

  • Affiliations:
  • -

  • Venue:
  • SPDP '95 Proceedings of the 7th IEEE Symposium on Parallel and Distributed Processing
  • Year:
  • 1995

Abstract

Research on replication techniques to reduce traffic and minimize the latency of information retrieval in a distributed system has concentrated on client-based caching, whereby recently or frequently accessed information is cached at a client (or at a proxy thereof) in anticipation of future accesses. We believe that such myopic solutions, which focus exclusively on a particular client or set of clients, are likely to have a limited impact. Instead, we offer a solution that allows the replication of information to be done on a global supply/demand basis. We propose a hierarchical demand-based replication strategy that optimally disseminates information from its producer to servers that are closer to its consumers. The level of dissemination depends on the relative popularity of documents and on the expected reduction in traffic that results from such dissemination. We used extensive HTTP logs to validate an analytical model of server popularity and file access profiles. Using that model, we show that by disseminating the most popular documents to servers closer to clients, network traffic could be reduced considerably while server load is balanced. We argue that this process could be generalized to provide an automated server-based information dissemination protocol that will be more effective in reducing both network bandwidth and document retrieval times than client-based caching protocols.
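
The idea of pushing the most popular documents toward the clients can be illustrated with a small sketch. It is not the analytical model from the paper: it assumes a simple greedy heuristic that ranks documents by requests per byte and copies them to a single proxy until a storage budget is used up. The names Document, choose_replicas, and traffic_saved_fraction, as well as the sample figures, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Document:
    name: str
    size_bytes: int  # assumed to be greater than zero
    requests: int    # remote requests counted over some log window


def choose_replicas(docs, budget_bytes):
    """Greedily pick documents to copy to a proxy nearer the clients.

    Documents are ranked by requests per byte (an assumed popularity
    measure), and each one is taken if it still fits in the budget.
    """
    ranked = sorted(docs, key=lambda d: d.requests / d.size_bytes, reverse=True)
    chosen, used = [], 0
    for doc in ranked:
        if used + doc.size_bytes <= budget_bytes:
            chosen.append(doc)
            used += doc.size_bytes
    return chosen


def traffic_saved_fraction(docs, chosen):
    """Fraction of requested bytes that no longer reach the home server."""
    total = sum(d.size_bytes * d.requests for d in docs)
    saved = sum(d.size_bytes * d.requests for d in chosen)
    return saved / total if total else 0.0


# Made-up access profile, for illustration only.
docs = [
    Document("index.html", 2_000, 900),
    Document("logo.gif", 5_000, 800),
    Document("paper.ps", 400_000, 20),
    Document("old.txt", 1_000, 1),
]
chosen = choose_replicas(docs, budget_bytes=8_000)
print([d.name for d in chosen], traffic_saved_fraction(docs, chosen))
```

In this toy profile a small proxy budget captures a large share of the requested bytes because a few small documents attract most of the requests, which is the kind of skew the abstract relies on. A hierarchical scheme would repeat such a selection at each level between the producer and its consumers.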