Finding the most unusual time series subsequence: algorithms and applications

  • Authors:
  • Eamonn Keogh; Jessica Lin; Sang-Hee Lee; Helga Van Herle

  • Affiliations:
  • University of California, Department of Computer Science and Engineering, Riverside, CA, USA; George Mason University, Department of Information and Software Engineering, Fairfax, VA, USA; University of California, Anthropology Department, Riverside, CA, USA; University of California, David Geffen School of Medicine, Los Angeles, CA, USA

  • Venue:
  • Knowledge and Information Systems
  • Year:
  • 2006

Abstract

In this work we introduce the new problem of finding time series discords. Time series discords are subsequences of a longer time series that are maximally different from all the other subsequences of that time series. They thus capture the sense of the most unusual subsequence within a time series. While discords have many uses in data mining, they are particularly attractive as anomaly detectors because they require only one intuitive parameter (the length of the subsequence), unlike most anomaly detection algorithms, which typically require many parameters. Although the brute-force algorithm for discovering time series discords is quadratic in the length of the time series, we present a simple algorithm that is three to four orders of magnitude faster while guaranteed to produce identical results. We evaluate our work with a comprehensive set of experiments on diverse data sources, including electrocardiograms, space telemetry, respiration physiology, anthropological and video datasets.
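
To make the quadratic brute-force baseline concrete, the following is a minimal sketch and not the authors' implementation. It assumes Euclidean distance between subsequences and assumes that overlapping subsequences are excluded from comparison, since a subsequence would otherwise trivially match a shifted copy of itself; neither choice is specified in the abstract. The function name `brute_force_discord` and its parameters are hypothetical. The faster algorithm mentioned in the abstract is not reproduced here.

```python
import math


def brute_force_discord(series, n):
    """Return (start, distance) for the length-n subsequence whose nearest
    non-overlapping neighbour is farthest away.

    Assumptions (not taken from the abstract): Euclidean distance, and
    candidates that overlap the query are skipped as trivial matches.
    """
    m = len(series) - n + 1  # number of subsequences of length n
    if n < 1 or m < 1:
        raise ValueError("subsequence length must be between 1 and len(series)")

    best_start, best_dist = -1, -1.0
    for p in range(m):  # outer loop: every candidate discord
        nearest = math.inf
        for q in range(m):  # inner loop: every possible neighbour
            if abs(p - q) < n:
                continue  # overlapping windows are not compared
            d = math.sqrt(
                sum((series[p + k] - series[q + k]) ** 2 for k in range(n))
            )
            nearest = min(nearest, d)
        # Keep the candidate whose nearest neighbour is the most distant.
        if nearest != math.inf and nearest > best_dist:
            best_start, best_dist = p, nearest
    return best_start, best_dist
```

Calling `brute_force_discord(series, n)` on a mostly periodic series with a single injected spike should return a start index whose window covers the spike. The two nested loops over all subsequences are what make this baseline quadratic in the length of the series, and the subsequence length `n` is the only parameter.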