This paper presents a new method that uses orthogonalized features for time series clustering and classification. To cluster or classify time series data, either the original data or features extracted from the data are used as input to a clustering or classification algorithm. Our method uses feature extraction to represent a time series by a fixed-dimensional vector whose components are statistical metrics. Each metric is a specific feature based on the global structure of the given time series. However, correlations between feature metrics can cause the clustering to take place in a distorted space. To address this, we propose orthogonalizing the space of metrics using linear correlation information, which reduces the impact that correlations between clustering inputs have on the clustering result. We demonstrate orthogonal feature learning with two popular clustering algorithms, k-means and hierarchical clustering. Two benchmark data sets are used in the experiments. The empirical results show that the proposed orthogonal feature learning method gives better clustering accuracy than all other approaches considered: exhaustive feature search, clustering without feature optimization or selection, and clustering without feature extraction. We expect our method to enhance the feature extraction process, which also serves as an improved dimension reduction solution for time series clustering and classification.
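The abstract does not spell out the orthogonalization procedure, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes three illustrative global features (mean, standard deviation, lag-1 autocorrelation) and uses Gram-Schmidt on the centered feature columns as a stand-in decorrelation step; the function names `extract_features` and `orthogonalize` are hypothetical and not taken from the paper.

```python
import math
import random


def extract_features(series):
    """Map one time series to a fixed-length vector of global statistics.

    The choice of features here is an assumption made for illustration.
    """
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series) / n
    std = math.sqrt(var)
    # Lag-1 autocorrelation; defined as 0 for a constant series.
    if var == 0:
        acf1 = 0.0
    else:
        acf1 = sum((series[t] - mean) * (series[t + 1] - mean)
                   for t in range(n - 1)) / (n * var)
    return [mean, std, acf1]


def orthogonalize(rows):
    """Decorrelate feature columns with Gram-Schmidt (an assumed stand-in).

    Columns are centered first, so orthogonal columns are also uncorrelated.
    Each output column is rescaled to unit variance.
    """
    n, d = len(rows), len(rows[0])
    cols = []
    for j in range(d):
        col = [r[j] for r in rows]
        mu = sum(col) / n
        cols.append([v - mu for v in col])
    basis = []
    for col in cols:
        resid = col[:]
        for b in basis:
            # Remove the part of this feature that is linearly explained by b.
            coef = sum(x * y for x, y in zip(resid, b)) / sum(y * y for y in b)
            resid = [x - coef * y for x, y in zip(resid, b)]
        scale = math.sqrt(sum(x * x for x in resid) / n)
        if scale > 1e-12:  # skip features that are fully redundant
            basis.append([x / scale for x in resid])
    # Back to one row per time series.
    return [list(row) for row in zip(*basis)]


# Synthetic data, for illustration only.
random.seed(0)
series_set = [[random.gauss(i % 2, 1.0 + 0.1 * i) for _ in range(50)]
              for i in range(20)]
features = [extract_features(s) for s in series_set]
decorrelated = orthogonalize(features)
```

The rows of `decorrelated` could then be passed to k-means or hierarchical clustering in place of the raw feature vectors. Gram-Schmidt depends on the order of the features; a symmetric alternative such as whitening with the eigenvectors of the correlation matrix would serve the same purpose in this sketch.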