Scalable dissemination: what's hot and what's not

  • Authors:
  • Jonathan Beaver;Nicholas Morsillo;Kirk Pruhs;Panos K. Chrysanthis;Vincenzo Liberatore

  • Affiliations:
  • University of Pittsburgh, Pittsburgh, PA;University of Pittsburgh, Pittsburgh, PA;University of Pittsburgh, Pittsburgh, PA;University of Pittsburgh, Pittsburgh, PA;Case Western Reserve University, Cleveland, Ohio

  • Venue:
  • Proceedings of the 7th International Workshop on the Web and Databases: colocated with ACM SIGMOD/PODS 2004
  • Year:
  • 2004

Abstract

A major problem in web database applications and on the Internet in general is the scalable delivery of data. One proposed solution for this problem is a hybrid system that uses multicast push to scalably deliver the most popular data, and reserves traditional unicast pull for delivery of less popular data. However, such a hybrid scheme introduces a variety of data management problems at the server. In this paper we examine three of these problems: the push popularity problem, the document classification problem, and the bandwidth division problem. The push popularity problem is to estimate the popularity of the documents in the web site. The document classification problem is to determine which documents should be pushed and which documents must be pulled. The bandwidth division problem is to determine how much of the server bandwidth to devote to pushed documents and how much of the server bandwidth should be reserved for pulled documents. We propose simple and elegant solutions for these problems. We report on experiments with our system that validate our algorithms.
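
To make the three problems concrete, the following is a minimal sketch of how they fit together at a server. It is not the paper's algorithm, which the abstract does not describe. The function names, the request-share popularity estimate, the fixed popularity threshold, and the naive proportional bandwidth split are all assumptions chosen only for illustration.

```python
from collections import Counter


def estimate_popularity(request_log):
    """Push popularity problem (assumed estimator): a document's popularity
    is its share of the observed requests."""
    counts = Counter(request_log)
    total = sum(counts.values())
    return {doc: n / total for doc, n in counts.items()}


def classify_documents(popularity, push_threshold):
    """Document classification problem (assumed rule): push every document
    whose estimated popularity meets the threshold, pull the rest."""
    push = {doc for doc, p in popularity.items() if p >= push_threshold}
    pull = set(popularity) - push
    return push, pull


def divide_bandwidth(popularity, push, total_bandwidth):
    """Bandwidth division problem (naive placeholder): give the push channel
    a share of bandwidth equal to the request share of pushed documents."""
    push_share = sum(popularity[doc] for doc in push)
    push_bandwidth = total_bandwidth * push_share
    return push_bandwidth, total_bandwidth - push_bandwidth


# Hypothetical request log: document "a" is hot, "c" is cold.
request_log = ["a"] * 6 + ["b"] * 3 + ["c"] * 1
popularity = estimate_popularity(request_log)
push, pull = classify_documents(popularity, push_threshold=0.25)
push_bw, pull_bw = divide_bandwidth(popularity, push, total_bandwidth=100.0)
```

With this hypothetical log, documents "a" and "b" land in the push set and "c" in the pull set. A real system would likely couple the three decisions, since the best split of bandwidth depends on which documents are pushed.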