Stochastic consistency, and scalable pull-based caching for erratic data stream sources

  • Authors:
  • Shanzhong Zhu;Chinya V. Ravishankar

  • Affiliations:
  • Department of Computer Science and Engineering, University of California, Riverside, CA; Department of Computer Science and Engineering, University of California, Riverside, CA

  • Venue:
  • VLDB '04 Proceedings of the Thirtieth international conference on Very large data bases - Volume 30
  • Year:
  • 2004

Abstract

We introduce the notion of stochastic consistency and propose a novel approach to achieving it for caches of highly erratic data. Erratic data sources, such as stock prices and sensor data, are common and important in practice. However, their erratic patterns of change make caching hard. Stochastic consistency guarantees that errors in cached values of erratic data remain within a user-specified bound with a user-specified probability. We use a Brownian motion model to capture the behavior of data changes, and use its underlying theory to predict when caches should initiate pulls to refresh cached copies and so maintain stochastic consistency. Our approach allows servers to remain totally stateless, thus achieving excellent scalability and reliability. We also discuss a new real-time scheduling approach for servicing pull requests at the server. Our scheduler delivers prompt responses whenever possible, and minimizes the aggregate cache-source deviation caused by delays during server overload. We conduct extensive experiments to validate our model on real-life datasets, and show that our scheme outperforms current schemes.
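
To make the idea of a probabilistic pull schedule concrete, here is a minimal sketch in Python. It is not the paper's algorithm. It assumes the simplest possible model: driftless Brownian motion with a known volatility `sigma`, so that the change in the source value after time t is normally distributed with variance sigma^2 * t. It also checks the error bound only at the pull instant rather than over the whole interval. Under those assumptions, the cache can wait at most (delta / (sigma * z))^2 time units, where z is the standard normal quantile at (1 + p) / 2. The function names `next_pull_interval` and `prob_within_bound` and all parameter names are hypothetical.

```python
from statistics import NormalDist


def next_pull_interval(sigma: float, delta: float, p: float) -> float:
    """Longest wait t such that P(|X_t - X_0| <= delta) >= p.

    Assumes driftless Brownian motion: X_t - X_0 ~ Normal(0, sigma^2 * t).
    This is an illustrative simplification, not the paper's formulation.
    """
    if sigma <= 0 or delta <= 0:
        raise ValueError("sigma and delta must be positive")
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    # Two-sided bound: P(|Z| <= z) = p  <=>  z = Phi^{-1}((1 + p) / 2)
    z = NormalDist().inv_cdf((1 + p) / 2)
    # Solve delta / (sigma * sqrt(t)) = z for t
    return (delta / (sigma * z)) ** 2


def prob_within_bound(sigma: float, delta: float, t: float) -> float:
    """P(|X_t - X_0| <= delta) under the same driftless model."""
    return 2 * NormalDist().cdf(delta / (sigma * t ** 0.5)) - 1


if __name__ == "__main__":
    # Hypothetical parameters: volatility 0.5 per sqrt(time unit),
    # error bound 1.0, required confidence 95%.
    t = next_pull_interval(sigma=0.5, delta=1.0, p=0.95)
    print(f"pull again after at most {t:.4f} time units")
    print(f"check: P(within bound) = {prob_within_bound(0.5, 1.0, t):.4f}")
```

In this sketch the cache computes the interval from local parameters alone, which is consistent with the abstract's point that servers can remain stateless. A tighter bound or a higher required probability shortens the interval and so increases the pull rate.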