Characteristic-Based Clustering for Time Series Data

Authors:
Xiaozhe Wang;Kate Smith;Rob Hyndman
Affiliations:
Faculty of Information Technology, Monash University, Clayton, Australia 3800;Faculty of Information Technology, Monash University, Clayton, Australia 3800;Department of Econometrics and Business Statistics, Monash University, Clayton, Australia 3800
Venue:
Data Mining and Knowledge Discovery
Year:
2006

Citing 0
Cited 24

Using feature-based fitness evaluation in symbolic regression with added noise
Proceedings of the 10th annual conference companion on Genetic and evolutionary computation

Cross-disciplinary perspectives on meta-learning for algorithm selection
ACM Computing Surveys (CSUR)

Periodic Pattern Analysis in Time Series Databases
DASFAA '09 Proceedings of the 14th International Conference on Database Systems for Advanced Applications

Rule induction for forecasting method selection: Meta-learning the characteristics of univariate time series
Neurocomputing

Finding Structural Similarity in Time Series Data Using Bag-of-Patterns Representation
SSDBM 2009 Proceedings of the 21st International Conference on Scientific and Statistical Database Management

Evolving stochastic processes using feature tests and genetic programming
Proceedings of the 11th Annual conference on Genetic and evolutionary computation

Human action recognition by feature-reduced Gaussian process classification
Pattern Recognition Letters

Similarity search in multimedia time series data using amplitude-level features
MMM'08 Proceedings of the 14th international conference on Advances in multimedia modeling

Characteristic-based descriptors for motion sequence recognition
PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining

Feature-based clustering for electricity use time series data
ICANNGA'09 Proceedings of the 9th international conference on Adaptive and natural computing algorithms

Exact indexing for massive time series databases under time warping distance
Data Mining and Knowledge Discovery

Orthogonal feature learning for time series clustering
ISNN'11 Proceedings of the 8th international conference on Advances in neural networks - Volume Part II

Human action recognition based on random spectral regression
AICI'11 Proceedings of the Third international conference on Artificial intelligence and computational intelligence - Volume Part III

Wavelets-based clustering of multivariate time series
Fuzzy Sets and Systems

The evolution of higher-level biochemical reaction models
Genetic Programming and Evolvable Machines

Short communication: Selective Subsequence Time Series clustering
Knowledge-Based Systems

Rotation-invariant similarity in time series using bag-of-patterns representation
Journal of Intelligent Information Systems

Feature selection for classification of oscillating time series
Expert Systems: The Journal of Knowledge Engineering

Interactive visual analysis of temporal cluster structures
EuroVis'11 Proceedings of the 13th Eurographics / IEEE - VGTC conference on Visualization

Clustering Household Electricity Use Profiles
Proceedings of Workshop on Machine Learning for Sensory Data Analysis

Stock market co-movement assessment using a three-phase clustering method
Expert Systems with Applications: An International Journal

A metric for unsupervised metalearning
Intelligent Data Analysis

Automatic selection of classification learning algorithms for data mining practitioners
Intelligent Data Analysis

Unsupervised categorization of human motion sequences
Intelligent Data Analysis

Abstract

With the growing importance of time series clustering research, particularly for similarity searches amongst long time series such as those arising in medicine or finance, it is critical to resolve the outstanding problems that make most clustering methods impractical under certain circumstances. When a time series is very long, some clustering algorithms may fail because the very notion of similarity is dubious in high-dimensional space, and many methods cannot handle missing data when the clustering is based on a distance metric.

This paper proposes a method for clustering time series based on their structural characteristics. Unlike other alternatives, this method does not cluster point values using a distance metric; rather, it clusters based on global features extracted from the time series. The feature measures are obtained from each individual series and can be fed into arbitrary clustering algorithms, including an unsupervised neural network algorithm, self-organizing map, or hierarchical clustering algorithm.

Global measures describing the time series are obtained by applying statistical operations that best capture the underlying characteristics: trend, seasonality, periodicity, serial correlation, skewness, kurtosis, chaos, nonlinearity, and self-similarity. Since the method clusters using extracted global measures, it reduces the dimensionality of the time series and is much less sensitive to missing or noisy data. We further provide a search mechanism to find the best selection from the feature set that should be used as the clustering inputs.

The proposed technique has been tested using benchmark time series datasets previously reported for time series clustering and a set of time series datasets with known characteristics. The empirical results show that our approach is able to yield meaningful clusters. The resulting clusters are similar to those produced by other methods, but with some promising and interesting variations that can be intuitively explained with knowledge of the global characteristics of the time series.
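
The general idea described in the abstract (summarize each series by a fixed-length vector of global measures, then cluster the vectors) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It covers only four of the nine characteristics named above (trend, serial correlation, skewness, kurtosis), and the specific measures used here (R^2 of a fitted line, lag-1 autocorrelation, standardized moments) are simple textbook stand-ins chosen for illustration. The min-max scaling step, the average-linkage hierarchical clustering, the synthetic data, and all function names are likewise assumptions.

```python
import math
import random
from statistics import mean, pstdev


def trend_strength(x):
    """R^2 of a least-squares line fitted against the time index (assumed measure)."""
    t = range(len(x))
    tm, xm = mean(t), mean(x)
    sxy = sum((ti - tm) * (xi - xm) for ti, xi in zip(t, x))
    sxx = sum((ti - tm) ** 2 for ti in t)
    syy = sum((xi - xm) ** 2 for xi in x)
    return sxy * sxy / (sxx * syy)


def lag1_autocorrelation(x):
    """Serial correlation at lag 1 (assumed measure)."""
    m = mean(x)
    num = sum((a - m) * (b - m) for a, b in zip(x, x[1:]))
    return num / sum((v - m) ** 2 for v in x)


def standardized_moment(x, order):
    """Order 3 gives skewness, order 4 gives (non-excess) kurtosis."""
    m, s = mean(x), pstdev(x)
    return mean(((v - m) / s) ** order for v in x)


def feature_vector(x):
    """Fixed-length summary of one series, whatever its length.
    Assumes the series is not constant (otherwise the measures divide by zero)."""
    return [trend_strength(x), lag1_autocorrelation(x),
            standardized_moment(x, 3), standardized_moment(x, 4)]


def min_max_scale(rows):
    """Rescale each feature column to [0, 1] so no single feature dominates."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    span = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - l) / s for v, l, s in zip(row, lo, span)] for row in rows]


def average_linkage(vectors, k):
    """Agglomerative clustering: repeatedly merge the two closest clusters
    (mean pairwise Euclidean distance) until k clusters remain."""
    clusters = [[i] for i in range(len(vectors))]

    def gap(a, b):
        return mean(math.dist(vectors[i], vectors[j]) for i in a for j in b)

    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: gap(clusters[p[0]], clusters[p[1]]))
        clusters[i].extend(clusters.pop(j))
    return sorted(sorted(c) for c in clusters)


# Synthetic series with known characteristics and deliberately unequal lengths.
rng = random.Random(0)
lengths = [150, 220, 300]
trending = [[0.5 * t + rng.gauss(0, 2) for t in range(n)] for n in lengths]
skewed_noise = [[rng.expovariate(1.0) for _ in range(n)] for n in lengths]
series = trending + skewed_noise

features = min_max_scale([feature_vector(s) for s in series])
clusters = average_linkage(features, k=2)
print(clusters)  # lists of series indices, one list per cluster
```

Because every series is reduced to the same four numbers, the clustering step never compares point values directly, so series of different lengths can be clustered together. Any clustering algorithm that accepts numeric vectors could replace `average_linkage` in this sketch.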