Imputing incomplete time-series data based on varied-window similarity measure of data sequences

  • Authors:
  • Sirapat Chiewchanwattana; Chidchanok Lursinsap; Chee-Hung Henry Chu

  • Affiliations:
  • Department of Computer Science, Faculty of Science, Khon-Kaen University, Khon-Kaen 40002, Thailand; Advanced Virtual and Intelligent Computing Center (AVIC), Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand; The Center for Advanced Computer Studies (CACS), University of Louisiana at Lafayette, Lafayette, LA 70504-4330, USA

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2007

Abstract

This paper presents a pattern characterization approach for imputing missing samples in time-series data. The new algorithm is based on the observation that time-series data arising from natural phenomena contain several sets of similar subsequences. Missing samples are imputed by finding a complete subsequence that is similar to the subsequence containing the missing samples and imputing the missing samples from that complete subsequence. The algorithm is tested on standard benchmark data sets as well as real-world data sets. The experimental results show that the imputation accuracy of the proposed algorithm, referred to as the varied-window similarity measure (VWSM) algorithm, is comparable to or better than that of traditional methods such as spline interpolation, multiple imputation (MI), and the optimal completion strategy fuzzy c-means (OCSFCM) algorithm for non-stationary time-series data.
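
The general idea of imputing from a similar complete subsequence can be illustrated with a minimal Python sketch. This is not the VWSM algorithm itself, whose details the abstract does not give. The function names (`find_gaps`, `impute_similar`), the mean squared difference used as the similarity measure, the candidate window sizes, and the choice to copy values directly from the best match are all assumptions made for illustration.

```python
import math


def find_gaps(x):
    """Return (start, end) index pairs (end exclusive) for each run of NaNs."""
    gaps, i, n = [], 0, len(x)
    while i < n:
        if math.isnan(x[i]):
            j = i
            while j < n and math.isnan(x[j]):
                j += 1
            gaps.append((i, j))
            i = j
        else:
            i += 1
    return gaps


def impute_similar(x, window_sizes=(2, 3, 4)):
    """Fill each gap from the complete subsequence whose context matches best.

    For every gap, the w samples before and after it form the context.
    Each fully observed segment of the same total length is a candidate;
    the candidate with the smallest mean squared context difference
    (an assumed similarity measure) supplies the missing values.
    """
    x = list(x)
    out = list(x)
    n = len(x)
    for start, end in find_gaps(x):
        gap = end - start
        best = None  # (score, candidate_start, w)
        for w in window_sizes:  # try several context widths (assumed values)
            if start - w < 0 or end + w > n:
                continue  # not enough samples around the gap for this width
            ctx = x[start - w:start] + x[end:end + w]
            if any(math.isnan(v) for v in ctx):
                continue  # context itself is incomplete
            total = 2 * w + gap
            for c in range(n - total + 1):
                seg = x[c:c + total]
                if any(math.isnan(v) for v in seg):
                    continue  # candidate must be a complete subsequence
                cand_ctx = seg[:w] + seg[w + gap:]
                score = sum((a - b) ** 2 for a, b in zip(ctx, cand_ctx)) / (2 * w)
                if best is None or score < best[0]:
                    best = (score, c, w)
        if best is not None:
            _, c, w = best
            out[start:end] = x[c + w:c + w + gap]
    return out


# Toy periodic series with two consecutive missing samples.
pattern = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0]
series = pattern * 4
damaged = list(series)
damaged[8] = damaged[9] = float("nan")

filled = impute_similar(damaged)
print(filled[8:10])  # [2.0, 3.0]
```

In this sketch a gap is left as NaN when no complete candidate subsequence exists for any of the tried window sizes. A real implementation would also need a policy for gaps at the series boundaries.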