Efficient Construction of Compact Shedding Filters for Data Stream Processing

  • Authors:
  • Bugra Gedik;Kun-Lung Wu;Philip S. Yu

  • Affiliations:
  • Thomas J. Watson Research Center, IBM Research, 19 Skyline Dr, Hawthorne, NY 10532. bgedik@us.ibm.com;Thomas J. Watson Research Center, IBM Research, 19 Skyline Dr, Hawthorne, NY 10532. klwu@us.ibm.com;Thomas J. Watson Research Center, IBM Research, 19 Skyline Dr, Hawthorne, NY 10532. psyu@us.ibm.com

  • Venue:
  • ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
  • Year:
  • 2008

Abstract

High-volume source streams, coupled with fluctuating rates, necessitate adaptive load shedding in data stream processing. Without it, a continual query (CQ) server whose capacity is inadequate to handle the arriving data may drop items at random, degrading the quality of the query results. To alleviate this problem, filters can be used at the source nodes. However, regular source filtering by itself is not sufficient to prevent random dropping, because the amount of data passing through the filters can still surpass the server's capacity. In this case, the source filters can apply intelligent load shedding to minimize the degradation in result quality. In this paper, we introduce a novel type of load-shedding source filter, called the Non-uniformly Regulated (NR) sifter. An NR sifter judiciously applies varying amounts of load shedding to different regions of the data space within the sifter. We formulate the construction of NR sifters as an optimization problem. NR sifters are compact and quickly configurable, allowing frequent adaptations, and provide fast lookup for deciding whether a data item should be dropped. We structure NR sifters as a set of (sifter region, drop threshold) pairs to achieve compactness, develop query consolidation techniques to enable quick construction, and introduce flexible space partitioning mechanisms to realize fast lookup.
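
As a rough illustration of the (sifter region, drop threshold) structure described in the abstract, the following Python sketch models a one-dimensional sifter. It is not the paper's construction: the class name `NRSifterSketch`, the interval-based regions, the binary-search lookup, and the rule "drop an item when a uniform random draw falls below the region's threshold" are all assumptions made for illustration, as is the choice to drop items that fall outside every region. The optimization that would pick the thresholds is not shown; the thresholds are simply given.

```python
import bisect
import random


class NRSifterSketch:
    """Hypothetical 1-D sifter: contiguous regions, each with a drop threshold.

    Regions are the half-open intervals [boundaries[i], boundaries[i + 1]).
    thresholds[i] is assumed to be the probability of dropping an item
    that falls into region i (an illustrative interpretation only).
    """

    def __init__(self, boundaries, thresholds, rng=None):
        if len(boundaries) != len(thresholds) + 1:
            raise ValueError("need exactly one threshold per region")
        if list(boundaries) != sorted(boundaries):
            raise ValueError("boundaries must be sorted")
        self.boundaries = list(boundaries)
        self.thresholds = list(thresholds)
        self.rng = rng or random.Random()

    def region_index(self, value):
        """Return the index of the region containing value, or None."""
        # Binary search over the sorted boundaries gives O(log n) lookup.
        idx = bisect.bisect_right(self.boundaries, value) - 1
        if idx < 0 or idx >= len(self.thresholds):
            return None
        return idx

    def should_drop(self, value):
        """Decide whether an item with the given value is shed at the source."""
        idx = self.region_index(value)
        if idx is None:
            # Assumption: items outside every region interest no query.
            return True
        # random() is in [0, 1), so 0.0 never drops and 1.0 always drops.
        return self.rng.random() < self.thresholds[idx]


# Three regions with increasing amounts of shedding.
sifter = NRSifterSketch([0, 10, 20, 30], [0.0, 0.5, 1.0], rng=random.Random(0))
stream = [3, 12, 18, 25, 42]
forwarded = [item for item in stream if not sifter.should_drop(item)]
```

In this sketch the whole sifter is two short lists, which hints at why such a structure can be compact and cheap to reconfigure: adapting to a new load level only means replacing the threshold values.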