Efficient Periodicity Mining in Time Series Databases Using Suffix Trees

Authors:
Faras Rasheed;Mohammed Alshalalfa;Reda Alhajj
Affiliations:
University of Calgary, Calgary;University of Calgary, Calgary;University of Calgary, Calgary
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2011

Abstract

Periodic pattern mining, or periodicity detection, has a number of applications, such as prediction, forecasting, and detection of unusual activities. The problem is not trivial because the data to be analyzed are mostly noisy and different periodicity types (namely symbol, sequence, and segment) are to be investigated. Accordingly, we argue that there is a need for a comprehensive approach that can analyze the whole time series or a subsection of it, handle different types of noise (to a certain degree), and at the same time detect different types of periodic patterns; combining these under one umbrella is by itself a challenge. In this paper, we present an algorithm which can detect symbol, sequence (partial), and segment (full cycle) periodicity in time series. The algorithm uses a suffix tree as the underlying data structure; this allows us to design the algorithm such that its worst-case complexity is O(k · n^2), where k is the maximum length of a periodic pattern and n is the length of the analyzed portion (whole or subsection) of the time series. The algorithm is noise resilient; it has been successfully demonstrated to work with replacement, insertion, deletion, or a mixture of these types of noise. We have tested the proposed algorithm on both synthetic and real data from different domains, including protein sequences. The conducted comparative study demonstrates the applicability and effectiveness of the proposed algorithm; it is generally more time-efficient and noise-resilient than existing algorithms.
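To make the notion of symbol periodicity concrete, the following is a minimal brute-force sketch in Python. It is not the suffix-tree algorithm of the paper, which the abstract does not spell out. It only illustrates the idea of checking whether a symbol recurs every p positions from some starting position, while tolerating some replacement noise through a confidence threshold. The function name `symbol_periodicity`, the parameters `min_conf` and `max_period`, and the definition of confidence as observed over expected repetitions are all assumptions made for this illustration.

```python
from collections import defaultdict


def symbol_periodicity(series, min_conf=0.8, max_period=None):
    """Brute-force illustration (hypothetical helper, not the paper's method).

    Returns tuples (symbol, period, start, confidence), where confidence is
    assumed to be: observed occurrences / expected occurrences of the symbol
    at positions start, start + period, start + 2 * period, ...
    """
    n = len(series)
    if max_period is None:
        max_period = n // 2

    # Record the positions at which each symbol occurs, in increasing order.
    positions = defaultdict(list)
    for i, sym in enumerate(series):
        positions[sym].append(i)

    results = []
    for sym, occ in positions.items():
        occ_set = set(occ)
        for period in range(1, max_period + 1):
            for start in occ:
                if start >= period:
                    break  # later starts only repeat an earlier offset
                expected = len(range(start, n, period))
                if expected < 2:
                    continue  # a single slot cannot show repetition
                observed = sum(1 for i in range(start, n, period) if i in occ_set)
                conf = observed / expected
                if conf >= min_conf:
                    results.append((sym, period, start, conf))
    return results


if __name__ == "__main__":
    # Noise-free series: every symbol repeats with period 3.
    print(symbol_periodicity("abcabcabc", min_conf=1.0, max_period=4))
    # One replacement error ('c' -> 'd'): 'c' is still found at lower confidence.
    print(symbol_periodicity("abcabdabc", min_conf=0.6, max_period=4))
```

This sketch enumerates every period and starting offset directly, so it says nothing about how the paper reaches its stated O(k · n^2) bound. It also does not handle insertion or deletion noise, which shift later positions and need the more careful treatment the abstract refers to.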