Classification of time series by shapelet transformation

Authors:
Jon Hills;Jason Lines;Edgaras Baranauskas;James Mapp;Anthony Bagnall
Affiliations:
University of East Anglia, Norwich, UK NR4 7TJ;University of East Anglia, Norwich, UK NR4 7TJ;University of East Anglia, Norwich, UK NR4 7TJ;University of East Anglia, Norwich, UK NR4 7TJ;University of East Anglia, Norwich, UK NR4 7TJ
Venue:
Data Mining and Knowledge Discovery
Year:
2014

Abstract

Time-series classification (TSC) problems present a specific challenge for classification algorithms: how to measure similarity between series. A shapelet is a time-series subsequence that allows for TSC based on local, phase-independent similarity in shape. Shapelet-based classification uses the similarity between a shapelet and a series as a discriminatory feature. One benefit of the shapelet approach is that shapelets are comprehensible, and can offer insight into the problem domain. The original shapelet-based classifier embeds the shapelet-discovery algorithm in a decision tree, and uses information gain to assess the quality of candidates, finding a new shapelet at each node of the tree through an enumerative search. Subsequent research has focused mainly on techniques to speed up the search. We examine how best to use the shapelet primitive to construct classifiers. We propose a single-scan shapelet algorithm that finds the best $k$ shapelets, which are used to produce a transformed dataset, where each of the $k$ features represents the distance between a time series and a shapelet. The primary advantages over the embedded approach are that the transformed data can be used in conjunction with any classifier, and that there is no recursive search for shapelets. We demonstrate that the transformed data, in conjunction with more complex classifiers, gives greater accuracy than the embedded shapelet tree. We also evaluate three similarity measures that produce equivalent results to information gain in less time. Finally, we show that by conducting post-transform clustering of shapelets, we can enhance the interpretability of the transformed data. We conduct our experiments on 29 datasets: 17 from the UCR repository, and 12 we provide ourselves.
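
The transformation step described in the abstract can be illustrated with a minimal sketch. It assumes the k shapelets have already been selected, and it takes the distance between a series and a shapelet to be the smallest plain Euclidean distance over all windows of the series that have the shapelet's length. The abstract does not specify the exact distance computation or any normalisation, so both are assumptions here, and the function names `shapelet_distance` and `shapelet_transform` are hypothetical.

```python
import numpy as np


def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between the shapelet and any
    equal-length window of the series (assumed definition, no normalisation)."""
    series = np.asarray(series, dtype=float)
    shapelet = np.asarray(shapelet, dtype=float)
    m = len(shapelet)
    if m > len(series):
        raise ValueError("shapelet is longer than the series")
    best = np.inf
    # Slide the shapelet along the series and keep the closest match.
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = np.sqrt(np.sum((window - shapelet) ** 2))
        best = min(best, dist)
    return float(best)


def shapelet_transform(dataset, shapelets):
    """Map each series to a vector of k distances, one per shapelet."""
    return np.array(
        [[shapelet_distance(series, s) for s in shapelets] for series in dataset]
    )


# Toy data (illustrative only): one series contains a bump, the other is flat.
dataset = [[0, 0, 1, 2, 1, 0], [0, 0, 0, 0, 0, 0]]
shapelets = [[1, 2, 1], [0, 0]]
X = shapelet_transform(dataset, shapelets)
```

Each row of `X` is a series and each column is a shapelet, so `X` can be passed to any standard classifier in place of the raw series. Because the match is taken over every window position, the feature does not depend on where in the series the shape occurs.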