All common subsequences

Authors:
Hui Wang
Affiliations:
School of Computing and Mathematics, University of Ulster, Northern Ireland, UK
Venue:
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Year:
2007



Abstract

Time series data abound in real-world problems, and measuring the similarity of time series is key to solving them. One state-of-the-art measure is the longest common subsequence, which uses the length of the longest common subsequence as an indication of similarity between sequences but ignores the information contained in the second, third, and subsequent longest common subsequences. To capture as much of the common information in sequences as possible, we propose a novel measure of sequence similarity: the number of all common subsequences. We show that this measure satisfies the common properties of similarity functions. Calculating it is not trivial, because a brute-force approach takes exponential time. We present a novel dynamic programming algorithm that calculates this number in polynomial time. We also suggest a different way of extending a class of such measures to multidimensional, real-valued time series, in the spirit of probabilistic metric spaces. We conducted an experimental study of the new similarity measure and the extension method for classification, and found both to be consistently competitive.
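
To make the contrast between exponential brute force and polynomial-time dynamic programming concrete, here is a minimal Python sketch. The abstract does not state the paper's recurrence, so this is not the author's algorithm. It is one standard way to count common subsequences, under two assumptions: that "common subsequences" means distinct subsequences, and that the empty subsequence is counted. The function names `count_common_subsequences` and `brute_force_count` are illustrative only.

```python
from itertools import combinations


def count_common_subsequences(a, b):
    """Count distinct common subsequences of a and b (empty one included).

    Assumption: this is a generic next-occurrence DP, not the recurrence
    from the paper. Runs in O(len(a) * len(b) * alphabet size) time.
    """
    n, m = len(a), len(b)
    alphabet = set(a) & set(b)

    def next_occurrence(seq):
        # nxt[i][c] = smallest k >= i with seq[k] == c (key absent if none)
        nxt = [dict() for _ in range(len(seq) + 1)]
        for i in range(len(seq) - 1, -1, -1):
            nxt[i] = dict(nxt[i + 1])
            nxt[i][seq[i]] = i
        return nxt

    next_a, next_b = next_occurrence(a), next_occurrence(b)

    # count[i][j] = distinct common subsequences of a[i:] and b[j:].
    # When either suffix is empty, only the empty subsequence is common.
    count = [[1] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            total = 1  # the empty subsequence
            for c in alphabet:
                ka = next_a[i].get(c)
                kb = next_b[j].get(c)
                if ka is not None and kb is not None:
                    # Each subsequence starting with c is counted once,
                    # via its leftmost match in both sequences.
                    total += count[ka + 1][kb + 1]
            count[i][j] = total
    return count[0][0]


def brute_force_count(a, b):
    """Exponential-time reference: enumerate every subsequence of each input."""
    def subsequences(s):
        return {
            tuple(s[i] for i in idx)
            for r in range(len(s) + 1)
            for idx in combinations(range(len(s)), r)
        }
    return len(subsequences(a) & subsequences(b))
```

Both functions accept strings or lists of hashable symbols. The brute-force version is practical only for very short inputs, which is the motivation for a dynamic programming approach.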