Imputing incomplete time-series data based on varied-window similarity measure of data sequences

Authors:
Sirapat Chiewchanwattana;Chidchanok Lursinsap;Chee-Hung Henry Chu
Affiliations:
Department of Computer Science, Faculty of Science, Khon-Kaen University, Khon-Kaen 40002, Thailand;Advanced Virtual and Intelligent Computing Center (AVIC), Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand;The Center for Advanced Computer Studies (CACS), University of Louisiana at Lafayette, Lafayette, LA 70504-4330, USA
Venue:
Pattern Recognition Letters
Year:
2007

Abstract

This paper presents a pattern characterization approach for imputing missing samples in time-series data. The new algorithm is based on the observation that time-series data arising from natural phenomena contain several sets of similar subsequences. Missing samples are imputed by finding a complete subsequence that is similar to the subsequence containing the missing samples and filling in the missing values from that complete subsequence. The algorithm is tested on standard benchmark data sets as well as real-world data sets. The experimental results showed that the imputation accuracy of the proposed algorithm, referred to as the varied-window similarity measure (VWSM) algorithm, is comparable to or better than that of traditional methods such as spline interpolation, multiple imputation (MI), and the optimal completion strategy fuzzy c-means (OCSFCM) algorithm, in the case of non-stationary time-series data.
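
The general idea in the abstract, copying missing values from a similar complete subsequence, can be illustrated with a minimal sketch. This is not the authors' VWSM algorithm: the abstract does not specify the similarity measure or how the window is varied, so the sketch assumes a fixed context width and a plain sum of squared differences over the observed samples around each gap. The function name `impute_by_similar_window` and the parameter `context` are hypothetical.

```python
import math


def impute_by_similar_window(series, context=3):
    """Fill runs of None by copying from the most similar complete window.

    Assumptions (not taken from the paper): a fixed context width on each
    side of the gap, and a sum of squared differences as the similarity.
    """
    n = len(series)
    result = list(series)  # work on a copy; the input is left unchanged
    i = 0
    while i < n:
        if series[i] is not None:
            i += 1
            continue
        # Locate the contiguous gap [i, j).
        j = i
        while j < n and series[j] is None:
            j += 1
        # Window covering the gap plus up to `context` samples on each side.
        lo = max(0, i - context)
        hi = min(n, j + context)
        length = hi - lo
        # Offsets (relative to lo) of the observed samples in that window.
        known = [k - lo for k in range(lo, hi) if series[k] is not None]
        best_start, best_dist = None, math.inf
        for s in range(n - length + 1):
            cand = series[s:s + length]
            if any(v is None for v in cand):
                continue  # candidate windows must be fully observed
            dist = sum((cand[k] - series[lo + k]) ** 2 for k in known)
            if dist < best_dist:
                best_start, best_dist = s, dist
        if best_start is not None:
            # Copy the values aligned with the gap from the best candidate.
            for k in range(i, j):
                result[k] = series[best_start + (k - lo)]
        i = j
    return result


if __name__ == "__main__":
    data = [1, 2, 3, 4] * 4
    data[5] = None
    data[6] = None
    print(impute_by_similar_window(data, context=2))
```

Because every candidate window must be fully observed, a gap is left unfilled when no complete window of the required length exists. A varied-window scheme, as the algorithm's name suggests, would presumably repeat the search over several window sizes, but that detail is not given in the abstract.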