HOT aSAX: a novel adaptive symbolic representation for time series discords discovery

  • Authors:
  • Ninh D. Pham;Quang Loc Le;Tran Khanh Dang

  • Affiliations:
  • Faculty of Computer Science and Engineering, HCM University of Technology, Vietnam National University of HoChiMinh City, Vietnam;Faculty of Computer Science and Engineering, HCM University of Technology, Vietnam National University of HoChiMinh City, Vietnam;Faculty of Computer Science and Engineering, HCM University of Technology, Vietnam National University of HoChiMinh City, Vietnam

  • Venue:
  • ACIIDS'10 Proceedings of the Second international conference on Intelligent information and database systems: Part I
  • Year:
  • 2010

Abstract

Finding discords in time series databases has been an important problem over the last decade because of its variety of real-world applications, including data cleansing, fault diagnostics, and financial data analysis. To our knowledge, the best-known approach is the HOT SAX technique, which relies on the equiprobable distribution of the SAX representations of time series. This property, however, is not always preserved in the reduced-dimensionality representation, especially for datasets that lack a Gaussian distribution. In this paper, we introduce a k-means based algorithm for symbolic representations of time series, called adaptive Symbolic Aggregate approXimation (aSAX), and propose the HOT aSAX algorithm for time series discord discovery. Owing to the clustered characteristic of aSAX words, our algorithm achieves greater pruning power than the previous approach. Our empirical experiments with real-world time series datasets confirm the theoretical analyses as well as the efficiency of our approach.
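The abstract does not spell out how the k-means step adapts the symbolic representation, so the following is only a minimal sketch of one plausible reading: start from the classic equiprobable Gaussian breakpoints used by SAX and refine them with a one-dimensional Lloyd (k-means) iteration on training values. All function names, parameters, the choice of training values, and the empty-interval fallback are assumptions made for illustration; they are not taken from the paper.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit standard deviation."""
    mu, sigma = fmean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: the mean of each chunk.

    Assumes len(series) >= n_segments so that no chunk is empty.
    """
    n = len(series)
    bounds = [round(i * n / n_segments) for i in range(n_segments + 1)]
    return [fmean(series[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]


def gaussian_breakpoints(alphabet_size):
    """Classic SAX breakpoints: equiprobable cuts under N(0, 1)."""
    nd = NormalDist()
    return [nd.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]


def adaptive_breakpoints(values, alphabet_size, max_iter=100, tol=1e-9):
    """Refine the breakpoints with 1-D Lloyd (k-means) iterations.

    Hypothetical helper: assumes alphabet_size >= 2 and that `values`
    are training values (for example, PAA coefficients).
    """
    breaks = gaussian_breakpoints(alphabet_size)
    for _ in range(max_iter):
        # Assign every value to the interval given by the current breakpoints.
        clusters = [[] for _ in range(alphabet_size)]
        for v in values:
            clusters[bisect_right(breaks, v)].append(v)
        # Centroid per interval; an empty interval reuses a bounding breakpoint
        # (an assumption of this sketch, chosen to keep centroids ordered).
        centroids = [
            fmean(c) if c else breaks[max(i - 1, 0)]
            for i, c in enumerate(clusters)
        ]
        # New breakpoints are the midpoints between adjacent centroids.
        new_breaks = [(a + b) / 2 for a, b in zip(centroids, centroids[1:])]
        converged = max(abs(a - b) for a, b in zip(breaks, new_breaks)) < tol
        breaks = new_breaks
        if converged:
            break
    return breaks


def to_word(paa_values, breaks):
    """Map each PAA value to a letter according to the breakpoints."""
    return "".join(chr(ord("a") + bisect_right(breaks, v)) for v in paa_values)


# Usage: skewed (non-Gaussian) training values move the breakpoints.
training = [-1.0] * 5 + [0.0] * 5 + [4.0] * 5
adapted = adaptive_breakpoints(training, alphabet_size=3)
word = to_word([-1.0, 0.0, 1.0, 4.0], adapted)
```

With skewed training values like these, the refined breakpoints move away from the Gaussian cuts, so a value that classic SAX would place in the top symbol can fall into the middle one instead. How HOT aSAX then uses the resulting words to order and prune the discord search is not described in the abstract and is not sketched here.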