Group SAX: extending the notion of contrast sets to time series and multimedia data

Authors:
Jessica Lin; Eamonn Keogh
Affiliations:
Information and Software Engineering, George Mason University; Department of Computer Science & Engineering, University of California, Riverside
Venue:
PKDD'06 Proceedings of the 10th European Conference on Principles and Practice of Knowledge Discovery in Databases
Year:
2006

Abstract

In this work, we take the traditional notion of contrast sets and extend it to other data types, in particular time series and, by extension, images. In the traditional sense, contrast-set mining identifies attributes, values, and instances that differ significantly across groups, and helps users understand the differences between groups of data. We reformulate the notion of contrast sets for time series data, defining a contrast set as the key pattern(s) in one set of data that are maximally different from the other set. We propose a fast and exact algorithm to find the contrast sets, and demonstrate its utility in several diverse domains, ranging from industry to anthropology. We show that our algorithm achieves a speedup of 3 orders of magnitude over the brute-force algorithm, while still producing exact solutions.
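To make the definition concrete, the following is a minimal sketch of a brute-force baseline of the kind the abstract compares against, not the authors' fast algorithm. It rests on assumptions the abstract does not state: that "maximally different" means the subsequence of one series whose nearest neighbor in the other series is farthest away, that subsequences are z-normalized, and that distance is Euclidean. All function names and the window length `m` are hypothetical.

```python
import math


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def subsequences(series, m):
    """All z-normalized sliding windows of length m, in order of start index."""
    return [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_contrast(series_a, series_b, m):
    """Return (start, distance) of the window in series_a whose nearest
    neighbor among the windows of series_b is farthest away (assumed
    reading of "maximally different")."""
    subs_b = subsequences(series_b, m)
    best_start, best_dist = -1, -1.0
    for start, cand in enumerate(subsequences(series_a, m)):
        # Distance from this candidate to its closest match in the other set.
        nn = min(euclidean(cand, other) for other in subs_b)
        if nn > best_dist:
            best_start, best_dist = start, nn
    return best_start, best_dist


# Usage: series_a repeats the same cycle as series_b except for one altered cycle.
series_b = [0, 1, 0, -1] * 6
series_a = [0, 1, 0, -1] * 2 + [0, 3, 3, -1] + [0, 1, 0, -1] * 3
start, dist = brute_force_contrast(series_a, series_b, m=4)
print(start, dist)
```

This sketch compares every window of one series against every window of the other, so its cost grows with the product of the two series lengths. That quadratic cost is what an exact but faster search would need to avoid.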