On the Stationarity of Multivariate Time Series for Correlation-Based Data Analysis

  • Authors:
  • Kiyoung Yang;Cyrus Shahabi

  • Affiliations:
  • University of Southern California;University of Southern California

  • Venue:
  • ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
  • Year:
  • 2005

Quantified Score

Hi-index 0.03

Visualization

Abstract

Multivariate time series (MTS) data sets are common in various multimedia, medical and financial application domains. These applications perform several data-analysis operations on large number of MTS data sets such as similarity searches, feature-subset-selection, clustering and classifications. Correlation-based techniques, such as Principal Component Analysis (PCA), have proven to improve the efficiency of many of the above-mentioned data-analysis operations on MTS, which implies that the correlation coefficientsconcisely represent the original MTS data. However, if the statistical properties (e.g., variance) of MTS data change over time dimension, i.e., MTS data is non-stationary, the correlation coefficients are not stable. In this paper, we propose to utilize the stationarity of the MTS data sets, in order to represent the original MTS data more stably, as well as concisely with the correlation coefficients. That is, before performing any correlation-based data analysis, we first executes the stationarity test to decide whether the MTS data is stationary or not, i.e., whether the correlation is stable or not. Subsequently, for a non-stationary MTS data set, we difference it to render the data set stationary. Even though our approach is general, to focus the discussion we describe our approach within the context of our previously proposed technique for MTS similarity search. In order to show the validity of our approach, we performed several experiments on four real-world data sets. The results show that the performance of our similarity search technique have significantly improved in terms of precision/recall.