Query indexing with containment-encoded intervals for efficient stream processing

  • Authors: Kun-Lung Wu; Shyh-Kwei Chen; S. Yu

  • Affiliations: IBM T.J. Watson Research Center, 19 Skyline Drive, Hawthorne, NY 10532, USA (all three authors)

  • Venue: Knowledge and Information Systems
  • Year: 2006

Abstract

Many continual range queries can be issued against data streams. To efficiently evaluate continual queries against a stream, a main-memory-based query index with a small storage cost and a fast search time is needed, especially if the stream is rapid. In this paper, we study a CEI-based query index that meets both criteria for efficient processing of continual interval queries. This new query index is an indirect indexing approach. It centers on a set of predefined virtual containment-encoded intervals, or CEIs. The CEIs are used first to decompose query intervals and then to perform efficient search operations. The CEIs are defined and labeled such that containment relationships among them are encoded in their IDs. The containment encoding makes decomposition and search operations efficient: from the encoding of the smallest CEI containing a data point, the encodings of the other containing CEIs can be easily derived. Closed-form formulae for the bounds of the average index storage cost are derived. Simulations are conducted to evaluate the effectiveness of the CEI-based query index and to compare it with alternative approaches. The results show that the CEI-based query index significantly outperforms existing approaches in terms of both storage cost and search time.
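
The abstract does not spell out the labeling scheme, so the following is only a minimal sketch of the idea under stated assumptions, not the paper's implementation. It assumes integer endpoints, half-open query intervals [lo, hi), a range partitioned into segments of length L (a power of two), and a heap-style numbering inside each segment: the whole segment has local ID 1, its two halves have IDs 2 and 3, and the unit-length CEIs have IDs L to 2L-1. Under that assumed numbering, the parent of a CEI is found by halving its local ID, which is how the containing CEIs are derived from the smallest one. The names `CEIIndex`, `decompose`, `insert`, and `search` are hypothetical.

```python
from collections import defaultdict


class CEIIndex:
    """Sketch of a query index over containment-encoded intervals (CEIs).

    Assumptions (not taken from the abstract): integer coordinates,
    half-open intervals [lo, hi), segment length L that is a power of two,
    and heap-style local IDs 1..2L-1 within each segment.
    """

    def __init__(self, L=8):
        if L <= 0 or L & (L - 1) != 0:
            raise ValueError("L must be a positive power of two")
        self.L = L
        # Global CEI ID -> set of query IDs whose decomposition uses that CEI.
        self.id_lists = defaultdict(set)

    def _global_id(self, seg, local_id):
        # Each segment owns 2L consecutive IDs (local ID 0 is unused).
        return seg * 2 * self.L + local_id

    def decompose(self, lo, hi):
        """Return the (segment, local_id) pairs of CEIs that tile [lo, hi)."""
        L, pieces = self.L, []
        while lo < hi:
            seg, off = divmod(lo, L)
            size = L
            # Shrink to the largest aligned CEI starting at lo that fits.
            while off % size != 0 or lo + size > hi:
                size //= 2
            # CEIs of this size have local IDs starting at L // size.
            pieces.append((seg, L // size + off // size))
            lo += size
        return pieces

    def insert(self, query_id, lo, hi):
        """Register a query interval [lo, hi) under each CEI that tiles it."""
        for seg, local_id in self.decompose(lo, hi):
            self.id_lists[self._global_id(seg, local_id)].add(query_id)

    def search(self, x):
        """Return the IDs of all registered queries whose interval contains x."""
        seg, off = divmod(int(x), self.L)
        local_id = self.L + off  # smallest (unit-length) CEI containing x
        result = set()
        while local_id >= 1:
            # .get avoids creating empty entries in the defaultdict.
            result |= self.id_lists.get(self._global_id(seg, local_id), set())
            local_id //= 2  # parent CEI: the next larger CEI containing x
        return result


index = CEIIndex(L=8)
index.insert("q1", 2, 11)
index.insert("q2", 8, 16)
index.insert("q3", 0, 8)
matches = index.search(5)  # queries whose interval contains the point 5
```

In this sketch a search touches only log2(L) + 1 ID lists per data point, and no interval comparison is performed: any query found in a visited list is guaranteed to contain the point. Choosing L trades off the number of CEIs per query (storage) against the number of lists visited per search.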