Principles and practice of information theory
Principles and practice of information theory
Finding interesting rules from large sets of discovered association rules
CIKM '94 Proceedings of the third international conference on Information and knowledge management
Combinatorial pattern discovery for scientific data: some preliminary results
SIGMOD '94 Proceedings of the 1994 ACM SIGMOD international conference on Management of data
Dynamic itemset counting and implication rules for market basket data
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Beyond market baskets: generalizing association rules to correlations
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Finding patterns in time series: a dynamic programming approach
Advances in knowledge discovery and data mining
Supporting fast search in time series for movement patterns in multiple scales
Proceedings of the seventh international conference on Information and knowledge management
Event detection from time series data
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Pruning and summarizing the discovered associations
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Identifying distinctive subsequences in multivariate time series by clustering
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Interestingness via what is not interesting
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining association rules with multiple minimum supports
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Prediction with local patterns using cross-entropy
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining frequent patterns without candidate generation
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Generating non-redundant association rules
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Small is beautiful: discovering the minimal set of unexpected patterns
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Deformable Markov model templates for time-series pattern matching
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Global partial orders from sequential data
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Multi-level organization and summarization of the discovered rules
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Mining asynchronous periodic patterns in time series data
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
FreeSpan: frequent pattern-projected sequential pattern mining
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Sequence mining in categorical domains: incorporating constraints
Proceedings of the ninth international conference on Information and knowledge management
SPADE: an efficient algorithm for mining frequent sequences
Machine Learning
Discovery of Frequent Episodes in Event Sequences
Data Mining and Knowledge Discovery
What Makes Patterns Interesting in Knowledge Discovery Systems
IEEE Transactions on Knowledge and Data Engineering
Discovering Frequent Event Patterns with Multiple Granularities in Time Sequences
IEEE Transactions on Knowledge and Data Engineering
Mining Sequential Patterns: Generalizations and Performance Improvements
EDBT '96 Proceedings of the 5th International Conference on Extending Database Technology: Advances in Database Technology
Efficient Retrieval of Similar Time Sequences Under Time Warping
ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering
Discovering All Most Specific Sentences by Randomized Algorithms
ICDT '97 Proceedings of the 6th International Conference on Database Theory
ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
PrefixSpan: Mining Sequential Patterns by Prefix-Projected Growth
Proceedings of the 17th International Conference on Data Engineering
PKDD '97 Proceedings of the First European Symposium on Principles of Data Mining and Knowledge Discovery
Managing Interesting Rules in Sequence Mining
PKDD '99 Proceedings of the Third European Conference on Principles of Data Mining and Knowledge Discovery
On the Discovery of Interesting Patterns in Association Rules
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
Mining Surprising Patterns Using Temporal Description Length
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
SPIRIT: Sequential Pattern Mining with Regular Expression Constraints
VLDB '99 Proceedings of the 25th International Conference on Very Large Data Bases
Mining Frequent Itemsets Using Support Constraints
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Efficient Mining of Statistical Dependencies
IJCAI '99 Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence
On Similarity-Based Queries for Time Series Data
ICDE '99 Proceedings of the 15th International Conference on Data Engineering
Efficient Mining of Partial Periodic Patterns in Time Series Database
ICDE '99 Proceedings of the 15th International Conference on Data Engineering
Finding Interesting Associations without Support Pruning
ICDE '00 Proceedings of the 16th International Conference on Data Engineering
Dynamic Miss-Counting Algorithms: Finding Implication and Similarity Rules with Confidence Pruning
ICDE '00 Proceedings of the 16th International Conference on Data Engineering
Mining unexpected multidimensional rules
Proceedings of the ACM tenth international workshop on Data warehousing and OLAP
Finding anomalous periodic time series
Machine Learning
Mining periodic behaviors of object movements for animal and biological sustainability studies
Data Mining and Knowledge Discovery
Discovery of Periodic Patterns in Sequence Data: A Variance-Based Approach
INFORMS Journal on Computing
Hi-index | 0.00 |
In this paper, we focus on mining surprising periodic patterns in a sequence of events. In many applications, e.g., computational biology, an infrequent pattern is still considered very significant if its actual occurrence frequency exceeds the prior expectation by a large margin. The traditional metric, such as support, is not necessarily the ideal model to measure this kind of surprising patterns because it treats all patterns equally in the sense that every occurrence carries the same weight towards the assessment of the significance of a pattern regardless of the probability of occurrence. A more suitable measurement, information, is introduced to naturally value the degree of surprise of each occurrence of a pattern as a continuous and monotonically decreasing function of its probability of occurrence. This would allow patterns with vastly different occurrence probabilities to be handled seamlessly. As the accumulated degree of surprise of all repetitions of a pattern, the concept of information gain is proposed to measure the overall degree of surprise of the pattern within a data sequence. The bounded information gain property is identified to tackle the predicament caused by the violation of the downward closure property by the information gain measure and in turn provides an efficient solution to this problem. Furthermore, the user has a choice between specifying a minimum information gain threshold and choosing the number of surprising patterns wanted. Empirical tests demonstrate the efficiency and the usefulness of the proposed model.