Subspace sums for extracting non-random data from massive noise
Knowledge and Information Systems
Data mining of vector–item patterns using neighborhood histograms
Knowledge and Information Systems
A new multiobjective clustering technique based on the concepts of stability and symmetry
Knowledge and Information Systems
Short communication: Selective Subsequence Time Series clustering
Knowledge-Based Systems
A methodological approach to mining and simulating data in complex information systems
Intelligent Data Analysis
Clustering of time series subsequence data commonly produces results that are not specific to the data set. This paper introduces a clustering algorithm that creates clusters exclusively from those subsequences that occur more frequently in a data set than would be expected by random chance. As such, it partially adopts a pattern mining perspective on clustering. When subsequences are labeled using such clusters, some may remain without a label. In fact, if the clusters were derived from an unrelated time series, the subsequences are expected to receive no label at all. We show that pattern-based clusters are indeed specific to the data set for 7 of the 10 real-world data sets we tested, and for window lengths up to 128 time points. While kernel-density-based clustering can be used to find clusters with similar properties for window sizes of 8–16 time points, its performance degrades quickly as the window size increases.
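The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes z-normalized sliding windows, a Euclidean match radius, and a null model built by shuffling the step sizes of the series; all function names and parameters (`frequent_indices`, `radius`, `n_surrogates`, and so on) are hypothetical. A subsequence is kept only if it has more non-overlapping matches than any subsequence of the shuffled surrogates, and a query subsequence stays unlabeled (`None`) when no kept prototype lies within the radius.

```python
import math
import random


def znorm(window):
    """Scale a window to zero mean and unit variance (flat windows map to zeros)."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0.0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def subsequences(series, w):
    """All z-normalized sliding windows of length w."""
    return [znorm(series[i:i + w]) for i in range(len(series) - w + 1)]


def neighbor_counts(subs, radius, w):
    """For each window, count non-overlapping windows within `radius`."""
    counts = []
    for i, a in enumerate(subs):
        c = 0
        for j, b in enumerate(subs):
            # Skip overlapping windows: they match trivially.
            if abs(i - j) >= w and math.dist(a, b) <= radius:
                c += 1
        counts.append(c)
    return counts


def shuffled_surrogate(series, rng):
    """Assumed null model: same step sizes as the series, in random order."""
    steps = [b - a for a, b in zip(series, series[1:])]
    rng.shuffle(steps)
    out = [series[0]]
    for s in steps:
        out.append(out[-1] + s)
    return out


def frequent_indices(series, w, radius, n_surrogates=20, seed=0):
    """Indices of windows with more matches than any surrogate window has."""
    rng = random.Random(seed)
    observed = neighbor_counts(subsequences(series, w), radius, w)
    baseline = 0
    for _ in range(n_surrogates):
        surrogate = shuffled_surrogate(series, rng)
        counts = neighbor_counts(subsequences(surrogate, w), radius, w)
        baseline = max(baseline, max(counts))
    return [i for i, c in enumerate(observed) if c > baseline]


def label(query, prototypes, radius):
    """Index of the nearest prototype within `radius`, or None (no label)."""
    q = znorm(query)
    best, best_d = None, radius
    for k, p in enumerate(prototypes):
        d = math.dist(q, p)
        if d <= best_d:
            best, best_d = k, d
    return best
```

In this sketch the prototypes passed to `label` would be the windows selected by `frequent_indices`, possibly grouped further by any standard clustering step. The quadratic neighbor search is chosen for readability only and would be too slow for long series.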