DAPSS: exact subsequence matching for data streams

  • Authors:
  • Yasuhiro Fujiwara;Yasushi Sakurai;Masashi Yamamuro

  • Affiliations:
  • NTT Cyber Space Laboratories, NTT Corporation, Kanagawa, Japan;NTT Cyber Space Laboratories, NTT Corporation, Kanagawa, Japan;NTT Cyber Space Laboratories, NTT Corporation, Kanagawa, Japan

  • Venue:
  • DASFAA'06 Proceedings of the 11th international conference on Database Systems for Advanced Applications
  • Year:
  • 2006

Abstract

There is much interest in processing data streams for applications in fields such as financial analysis, network monitoring, mobile services, and sensor network management. The key characteristic of stream data, that it arrives continuously, demands a new approach. This paper focuses on the problem of exactly detecting similar pairs of subsequences of arbitrary length in a streaming fashion. We propose DAPSS (DAta stream Processing for Store and Search), an efficient and effective method for detecting such pairs, which keeps (1) the feature data of each sequence in memory and (2) the compressed data of the original sequences on disk. Experiments on synthetic and real data sets show that DAPSS is up to 35 times faster than the naive method while guaranteeing the correctness of query results.
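
The abstract does not say which features or compression scheme DAPSS uses, so the following is only a minimal sketch of the general filter-and-verify idea it describes: compact features are checked first, and the full data is consulted only for candidates that survive. Everything here is an assumption made for illustration: the segment-mean features (piecewise aggregate approximation), Euclidean distance, the threshold `eps`, fixed-length windows whose length is a multiple of `segments`, and all function names. An in-memory list stands in for the sequences that would be kept on disk.

```python
import math


def paa(seq, segments):
    """Hypothetical feature: the mean of each equal-width segment."""
    width = len(seq) // segments
    return [sum(seq[i * width:(i + 1) * width]) / width for i in range(segments)]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def lower_bound(fa, fb, width):
    """sqrt(width) times the feature distance never exceeds the true distance."""
    return math.sqrt(width) * euclidean(fa, fb)


def similar_pairs(windows, eps, segments=4):
    """Return (pairs within eps, number of full-distance checks performed)."""
    width = len(windows[0]) // segments
    features = [paa(w, segments) for w in windows]  # small, memory-resident
    pairs, verified = [], 0
    for i in range(len(windows)):
        for j in range(i + 1, len(windows)):
            # Filter: the bound is a lower bound, so pruning loses no true match.
            if lower_bound(features[i], features[j], width) > eps:
                continue
            # Verify: only surviving candidates touch the full data.
            verified += 1
            if euclidean(windows[i], windows[j]) <= eps:
                pairs.append((i, j))
    return pairs, verified


windows = [[0] * 8, [0] * 7 + [1], [10] * 8]
pairs, verified = similar_pairs(windows, eps=1.5)
print(pairs, verified)
```

Because the feature distance is a lower bound on the true distance, the filter step can discard a pair only when it cannot be a match, which is how a scheme of this kind can be faster than comparing every pair in full without giving up exact answers.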