Disk aware discord discovery: finding unusual time series in terabyte sized datasets
Knowledge and Information Systems
In a recent article, Eamonn et al. [1] introduced algorithms for detecting the most unusual time series sub-sequences. These algorithms have significant implications for fast, intelligent data mining that exploits advances in modern computer technology, and they are used to detect unusual sub-sequences in time series arising from a wide range of applications. This paper revisits the algorithms introduced by the above authors and makes key improvements for a large class of time series processes by:
- Objectively identifying the size of the best sliding window for which similarities and discords can be found efficiently;
- Reducing the processing time by a factor equivalent to the length of the best sliding window;
- Introducing an entropy-based measure as an alternative distance measure to account for outliers within specific sliding windows;
- Highlighting comparisons with existing tools;
- Demonstrating the new approach through applications to real-life time series.
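For orientation, the sliding-window discord search that these improvements build on can be sketched as a brute-force baseline. This is a minimal illustration and not the method of this paper or of [1]: the names `znorm` and `find_discord` are hypothetical, the window length `w` is simply supplied by the caller rather than chosen objectively, and z-normalised Euclidean distance is an assumed stand-in for whichever distance (including the entropy-based one) is actually used.

```python
import math


def znorm(window):
    """Z-normalise a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std < 1e-12:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def find_discord(series, w):
    """Return (start, distance) of the window of length w whose nearest
    non-overlapping neighbour is farthest away (brute force, O(n^2 * w))."""
    n_sub = len(series) - w + 1
    if w < 2 or n_sub <= w:
        raise ValueError("series too short for this window length")
    # Assumption: every window is compared in z-normalised form.
    subs = [znorm(series[i:i + w]) for i in range(n_sub)]
    best_pos, best_dist = -1, -1.0
    for i in range(n_sub):
        nearest = math.inf
        for j in range(n_sub):
            # Skip trivial matches: windows that overlap window i.
            if abs(i - j) < w:
                continue
            d = math.dist(subs[i], subs[j])
            if d < nearest:
                nearest = d
        if nearest > best_dist:
            best_pos, best_dist = i, nearest
    return best_pos, best_dist


if __name__ == "__main__":
    # Periodic signal with a single injected spike at index 21.
    series = [0.0, 1.0, 0.0, -1.0] * 10
    series[21] = 5.0
    print(find_discord(series, 4))
```

The nested loop visits every pair of windows, which is the cost that window-size selection and faster search strategies aim to reduce.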