In this work, we take the traditional notion of contrast sets and extend it to other data types, in particular time series and, by extension, images. In the traditional sense, contrast-set mining identifies attributes, values, and instances that differ significantly across groups, and helps users understand the differences between groups of data. We reformulate the notion of contrast sets for time series data, defining a contrast set as the key pattern(s) in one set of data that are maximally different from the other set. We propose a fast and exact algorithm to find contrast sets and demonstrate its utility in several diverse domains, ranging from industry to anthropology. We show that our algorithm achieves a speedup of three orders of magnitude over the brute-force algorithm while still producing exact solutions.
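To make the brute-force baseline concrete, the following minimal sketch shows one possible reading of "maximally different": for every length-m subsequence of one series, find its nearest neighbor among the subsequences of the other series, and report the subsequence whose nearest neighbor is farthest away. The abstract does not give the exact definition, the distance measure, or the fast algorithm, so everything here is an assumption made for illustration: the function names, the use of raw Euclidean distance without normalization, and the single-series inputs are all hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


def subsequences(series, m):
    """All contiguous length-m windows of `series`, in order of start index."""
    return [series[i:i + m] for i in range(len(series) - m + 1)]


def brute_force_contrast(a, b, m):
    """Return (start, distance) of the length-m window of `a` that is
    farthest from its nearest neighbor among the windows of `b`.

    Assumption: "maximally different" is read as "largest nearest-neighbor
    distance to the other series". This is an illustrative baseline only,
    not the fast exact algorithm the abstract refers to.
    """
    if m <= 0 or m > len(a) or m > len(b):
        raise ValueError("m must be between 1 and the length of both series")

    windows_b = subsequences(b, m)
    best_start, best_dist = -1, -1.0
    for start, window_a in enumerate(subsequences(a, m)):
        # Nearest-neighbor distance from this window of `a` to series `b`.
        nearest = min(euclidean(window_a, window_b) for window_b in windows_b)
        # Strict comparison keeps the earliest window on ties.
        if nearest > best_dist:
            best_start, best_dist = start, nearest
    return best_start, best_dist


# Series `a` contains a plateau that never occurs in the flat series `b`.
a = [0, 0, 0, 5, 5, 5, 0, 0, 0]
b = [0] * 9
start, dist = brute_force_contrast(a, b, 3)
print(start, round(dist, 3))
```

This sketch compares every window of one series against every window of the other, so its cost grows with the product of the two series lengths times m. That quadratic behavior is the kind of expense an exact speedup would have to avoid.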