HOT aSAX: a novel adaptive symbolic representation for time series discords discovery

Authors:
Ninh D. Pham;Quang Loc Le;Tran Khanh Dang
Affiliations:
Faculty of Computer Science and Engineering, HCM University of Technology, Vietnam National University of HoChiMinh City, Vietnam;Faculty of Computer Science and Engineering, HCM University of Technology, Vietnam National University of HoChiMinh City, Vietnam;Faculty of Computer Science and Engineering, HCM University of Technology, Vietnam National University of HoChiMinh City, Vietnam
Venue:
ACIIDS'10 Proceedings of the Second international conference on Intelligent information and database systems: Part I
Year:
2010

Citing 4

Fast subsequence matching in time-series databases
SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data

Efficient Time Series Matching by Wavelets
ICDE '99 Proceedings of the 15th International Conference on Data Engineering

HOT SAX: Efficiently Finding the Most Unusual Time Series Subsequence
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining

Experiencing SAX: a novel symbolic representation of time series
Data Mining and Knowledge Discovery

Cited 1

Faster and parameter-free discord search in quasi-periodic time series
PAKDD'11 Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part II


Abstract

Finding discords in time series databases has been an important problem over the last decade because of its variety of real-world applications, including data cleansing, fault diagnostics, and financial data analysis. To our knowledge, the best-known approach is the HOT SAX technique, which relies on the equiprobable distribution of SAX representations of time series. This characteristic, however, is not preserved in the reduced-dimensionality representation, especially for datasets that do not follow a Gaussian distribution. In this paper, we introduce a k-means based algorithm for symbolic representations of time series, called adaptive Symbolic Aggregate approXimation (aSAX), and propose the HOT aSAX algorithm for time series discord discovery. Because aSAX words are clustered, our algorithm achieves greater pruning power than the previous approach. Our empirical experiments with real-world time series datasets confirm the theoretical analyses as well as the efficiency of our approach.
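
The abstract does not say how k-means is applied, so the following Python sketch is only one plausible reading, not the authors' algorithm. It assumes that the series is z-normalized and reduced with Piecewise Aggregate Approximation (PAA) as in standard SAX, that a one-dimensional k-means (Lloyd's algorithm) is run over the PAA values, and that the breakpoints are the midpoints between adjacent cluster centres. The function names (`znormalize`, `paa`, `adaptive_breakpoints`, `to_word`), the quantile-based initialisation, and the iteration cap are all hypothetical choices made for illustration.

```python
from bisect import bisect_right
from statistics import mean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit variance (constant series map to zeros)."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def paa(series, segments):
    """Piecewise Aggregate Approximation: the mean of each of `segments` frames.

    Assumes len(series) >= segments so that no frame is empty.
    """
    n = len(series)
    return [
        mean(series[i * n // segments:(i + 1) * n // segments])
        for i in range(segments)
    ]


def adaptive_breakpoints(values, alphabet_size, iterations=50):
    """Run 1-D k-means over `values` and return alphabet_size - 1 sorted breakpoints.

    Assumption (not from the abstract): each breakpoint is the midpoint
    between two adjacent cluster centres.
    """
    ordered = sorted(values)
    # Hypothetical initialisation: centres at evenly spaced quantiles.
    centres = [
        ordered[(2 * j + 1) * len(ordered) // (2 * alphabet_size)]
        for j in range(alphabet_size)
    ]
    for _ in range(iterations):
        breakpoints = [(a + b) / 2 for a, b in zip(centres, centres[1:])]
        clusters = [[] for _ in centres]
        for v in ordered:
            # Assign each value to the interval (cluster) it falls into.
            clusters[bisect_right(breakpoints, v)].append(v)
        # An empty cluster keeps its previous centre.
        updated = [mean(c) if c else centres[j] for j, c in enumerate(clusters)]
        if updated == centres:
            break
        centres = updated
    return [(a + b) / 2 for a, b in zip(centres, centres[1:])]


def to_word(paa_values, breakpoints):
    """Map each PAA value to a letter ('a', 'b', ...) by its breakpoint interval."""
    return "".join(chr(ord("a") + bisect_right(breakpoints, v)) for v in paa_values)
```

Under these assumptions, a caller would z-normalize the series, collect the PAA values of its subsequences, learn the breakpoints once with `adaptive_breakpoints`, and then encode each subsequence with `to_word`. Unlike the fixed Gaussian breakpoints of standard SAX, the learned breakpoints follow the empirical distribution of the data.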