A regression-based temporal pattern mining scheme for data streams

  • Authors:
  • Wei-Guang Teng;Ming-Syan Chen;Philip S. Yu

  • Affiliations:
  • Electrical Engineering Department, National Taiwan University, Taipei, Taiwan, ROC;Electrical Engineering Department, National Taiwan University, Taipei, Taiwan, ROC;IBM T. J. Watson Research Center, Yorktown, NY

  • Venue:
  • VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
  • Year:
  • 2003

Abstract

We devise in this paper a regression-based algorithm, called algorithm FTP-DS (Frequent Temporal Patterns of Data Streams), to mine frequent temporal patterns for data streams. While providing a general framework of pattern frequency counting, algorithm FTP-DS has two major features, namely one data scan for online statistics collection and regression-based compact pattern representation. To attain the feature of one data scan, the data segmentation and the pattern growth scenarios are explored for the frequency counting purpose. Algorithm FTP-DS scans online transaction flows and generates candidate frequent patterns in real time. The second important feature of algorithm FTP-DS is the regression-based compact pattern representation. Specifically, to meet the space constraint, we devise for pattern representation a compact ATF (standing for Accumulated Time and Frequency) form that aggregates all the information required for regression analysis. In addition, we develop the techniques of segmentation tuning and segment relaxation to enhance the functions of FTP-DS. With these features, algorithm FTP-DS is able not only to conduct mining with variable time intervals but also to perform trend detection effectively. Synthetic data and a real dataset which contains network alarm logs from a major telecommunication company are utilized to verify the feasibility of algorithm FTP-DS.
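
The abstract does not define the ATF form, so the following is only a minimal sketch of the general idea it names: keeping a handful of accumulated values per pattern so that a regression line can be fitted without storing the full history. The class name `RegressionSummary`, its fields, and the choice of simple least-squares fitting are all assumptions made for illustration and should not be read as the paper's actual data structure.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RegressionSummary:
    """Running sums sufficient to fit a line f = intercept + slope * t.

    Hypothetical illustration only: the field layout is an assumption,
    not the ATF form defined in the paper.
    """
    n: int = 0            # number of (time, frequency) observations seen
    sum_t: float = 0.0    # accumulated time values
    sum_f: float = 0.0    # accumulated frequency values
    sum_tf: float = 0.0   # accumulated products t * f
    sum_tt: float = 0.0   # accumulated squares t * t

    def add(self, t: float, f: float) -> None:
        """Fold one observation into the summary in constant space."""
        self.n += 1
        self.sum_t += t
        self.sum_f += f
        self.sum_tf += t * f
        self.sum_tt += t * t

    def fit(self) -> Tuple[float, float]:
        """Return (intercept, slope) of the ordinary least-squares line."""
        denom = self.n * self.sum_tt - self.sum_t ** 2
        if self.n < 2 or denom == 0:
            raise ValueError("need at least two distinct time points")
        slope = (self.n * self.sum_tf - self.sum_t * self.sum_f) / denom
        intercept = (self.sum_f - slope * self.sum_t) / self.n
        return intercept, slope


# Usage: feed per-segment frequencies of one pattern, then inspect the trend.
summary = RegressionSummary()
for t, f in [(1, 0.1), (2, 0.2), (3, 0.3)]:
    summary.add(t, f)
intercept, slope = summary.fit()
```

In this sketch a positive `slope` would suggest a pattern whose frequency is rising over time, which is the kind of trend detection the abstract mentions. The memory cost is five numbers per pattern regardless of stream length.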