Approximate membership query over time-decaying windows for event stream processing

  • Authors:
  • Yang Liu;Wenji Chen;Yong Guan

  • Affiliations:
  • Iowa State University, Ames, Iowa;Iowa State University, Ames, Iowa;Iowa State University, Ames, Iowa

  • Venue:
  • Proceedings of the 6th ACM International Conference on Distributed Event-Based Systems
  • Year:
  • 2012

Abstract

The search for space-efficient data structures that support approximate membership queries has a long history, starting with Bloom's work in the 1970s. Given a set A of n items and an additional item x from the same universe u of size m ≫ n, we want to determine whether x ∈ A using limited space. If A is static, there exist optimal algorithms that construct a randomized data structure representing A in only (1 + o(1)) n log(1/δ) bits, allowing a small false positive probability δ but no false negatives. However, existing optimal algorithms are not practical for many event-based systems, e.g., web services, peer-to-peer systems, and network traffic monitoring. In these systems, items are inserted or updated dynamically in a stream of events, and we are interested in recently updated items. In this paper, we propose a novel data structure to support approximate membership queries in a time-decaying window model. In this model, items are inserted one by one over a data stream, and we want to determine whether an item is among the most recent w items for any given window size w ≤ n. Our data structure requires only O(n(log(1/δ) + log n)) bits and O(1) running time.
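
To make the time-decaying window query concrete, the following is a minimal Python sketch of a simple baseline, a Bloom filter whose cells hold insertion timestamps instead of single bits. It is not the data structure proposed in the paper, whose details the abstract does not give. The class name `TimestampBloomFilter`, the parameters `num_cells` and `num_hashes`, and the SHA-256 double-hashing scheme are all assumptions made for illustration. The sketch also uses unbounded Python integers as timestamps, whereas a space-bounded version would need to wrap timestamps to keep each cell at roughly log n bits.

```python
import hashlib


class TimestampBloomFilter:
    """Illustrative baseline only; not the structure from the paper."""

    def __init__(self, num_cells, num_hashes):
        self.num_cells = num_cells      # hypothetical sizing parameter
        self.num_hashes = num_hashes    # hypothetical number of hash functions
        self.cells = [0] * num_cells    # 0 means "never written"
        self.clock = 0                  # number of items inserted so far

    def _positions(self, item):
        # Derive num_hashes cell indices from one SHA-256 digest
        # via double hashing (an assumed, common construction).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_cells for i in range(self.num_hashes)]

    def insert(self, item):
        # Stamp every cell of the item with the current stream position.
        self.clock += 1
        for pos in self._positions(item):
            self.cells[pos] = self.clock

    def query(self, item, window):
        # Cells are only ever overwritten with later timestamps, so the
        # minimum over an item's cells is at least its last insertion time.
        # Hence an item inside the window is never reported as absent.
        oldest = min(self.cells[pos] for pos in self._positions(item))
        return oldest > max(0, self.clock - window)


if __name__ == "__main__":
    f = TimestampBloomFilter(num_cells=1024, num_hashes=3)
    for event in ["a", "b", "c", "d"]:
        f.insert(event)
    print(f.query("d", 1))  # "d" is the most recent item
    print(f.query("a", 4))  # "a" is within the last 4 items
```

As with a classical Bloom filter, this baseline can return false positives (another item may have refreshed all of an item's cells) but no false negatives, and a single structure answers queries for any window size w supplied at query time.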