Efficient range-constrained similarity search on wavelet synopses over multiple streams

  • Authors:
  • Hao-Ping Hung;Ming-Syan Chen

  • Affiliations:
  • National Taiwan University;National Taiwan University

  • Venue:
  • CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

Due to the resource limitation in the data stream environment, it has been reported that answering user queries according to the wavelet synopsis of a stream is an essential ability of a Data Stream Management System (DSMS). In this paper, motivated by the fact that a user may be interested in an arbitrary range of the data streams, we investigate two important types of range-constrained queries in time series streaming environments: the distance queries (which aim at obtaining the Euclidean distance between two streams) and the kNN queries (which aim at discovering k nearest neighbors to a reference stream). To achieve high efficiency in processing these two types of queries, we propose procedure RED (standing for Range-constrained Euclidean Distance) and algorithm EKS (standing for Enhanced KNN Search). Compared to the existing methods in the prior research, the advantageous features of our approaches are in two folds. First, our approaches are capable of processing the queries directly from the wavelet synopses retained in the main memory without using IDWT to reconstruct the data cells. This feature allows us to save the complexity in both memory and time. Moreover, our approaches enable the users to query the DSMS within their range of interest. Unlike the conventional methods which only support the full-range query processing, this feature will enhance the flexibility at the client side. We evaluate procedure RED and algorithm EKS on live and synthetic datasets empirically and show that the proposed approaches are efficient in similarity search and kNN discovery within arbitrary ranges in the time series streaming environments.