An efficient and resilient approach to filtering and disseminating streaming data

  • Authors:
  • Shetal Shah; Shyamshankar Dharmarajan; Krithi Ramamritham

  • Affiliations:
  • TCS Lab for Internet Research, Dept of Comp Science and Engg, Indian Institute of Technology Bombay, Mumbai, India; TCS Lab for Internet Research, Dept of Comp Science and Engg, Indian Institute of Technology Bombay, Mumbai, India; TCS Lab for Internet Research, Dept of Comp Science and Engg, Indian Institute of Technology Bombay, Mumbai, India

  • Venue:
  • VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
  • Year:
  • 2003

Abstract

Many web users monitor dynamic data such as stock prices, real-time sensor data and traffic data to make on-line decisions. Instances of such data can be viewed as data streams. In this paper, we consider techniques for creating a resilient and efficient content distribution network for such dynamically changing streaming data. We address the problem of maintaining the coherency of dynamic data items in a network of repositories: data disseminated to one repository is filtered by that repository and disseminated to the repositories that depend on it. Our method is resilient to link failures and repository failures. This resiliency implies that data fidelity is not lost even when the repository from which a user obtains data, or a communication path through which the data travels, experiences failures. Experimental evaluation, using real-world traces of streaming data, demonstrates that (i) the computational and communication cost of adding this redundancy is low, and (ii) surprisingly, in many cases, adding resiliency-enhancing features actually improves the fidelity provided by the system even when there are no failures. To further enhance fidelity, we also propose efficient techniques for filtering data arriving at one repository and for scheduling the dissemination of filtered data to another repository. Our results show that the combination of resiliency-enhancing and efficiency-improving techniques in fact helps realize the potential that push-based systems are said to have in delivering 100% fidelity. Without them, computational and communication delays inherent in dissemination networks can lead to a large fidelity loss even in push-based dissemination.
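
As a rough illustration of the filter-then-disseminate idea described in the abstract, the sketch below models a chain of repositories in Python. It is not the paper's algorithm: the abstract does not specify the filtering rule, so a simple threshold filter is assumed here, in which an update is pushed to a dependent only when it differs from the last value sent to that dependent by more than a per-dependent tolerance. The names `Repository`, `add_dependent`, `receive`, and the tolerance parameter are all hypothetical, and failures, delays, and scheduling are not modeled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Repository:
    """Hypothetical repository node that filters updates before forwarding them."""
    name: str
    # dependent name -> (dependent repository, assumed coherency tolerance)
    dependents: dict = field(default_factory=dict)
    # dependent name -> last value this repository pushed to that dependent
    last_sent: dict = field(default_factory=dict)
    value: Optional[float] = None

    def add_dependent(self, repo: "Repository", tolerance: float) -> None:
        """Register a dependent together with the deviation it can tolerate."""
        self.dependents[repo.name] = (repo, tolerance)

    def receive(self, new_value: float) -> list:
        """Store an update, then push it to each dependent whose tolerance is exceeded.

        Returns the names of all repositories the update reached downstream.
        """
        self.value = new_value
        reached = []
        for dep_name, (repo, tolerance) in self.dependents.items():
            previous = self.last_sent.get(dep_name)
            # Assumed filter: forward the first update, or any update that
            # deviates from the last forwarded value by more than the tolerance.
            if previous is None or abs(new_value - previous) > tolerance:
                self.last_sent[dep_name] = new_value
                reached.append(dep_name)
                reached.extend(repo.receive(new_value))
        return reached
```

In this sketch each repository tracks the last value sent per dependent, so a dependent with a loose tolerance receives fewer updates than one with a tight tolerance, which is where the communication savings of filtering would come from.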