An ideal outcome of pattern mining is a small set of informative patterns, free of redundancy and noise, that identifies the key structure of the data at hand. Standard frequent pattern miners do not achieve this goal: because of the pattern explosion, they typically return very large numbers of highly redundant patterns. We pursue the ideal for sequential data by employing a pattern set mining approach, in which results are evaluated as a whole instead of ranking patterns individually. Pattern set mining has been successfully applied to transactional data, but has been surprisingly understudied for sequential data. In this paper, we employ the Minimum Description Length (MDL) principle to identify the set of sequential patterns that summarises the data best. In particular, we formalise how to encode sequential data using sets of serial episodes, and use the encoded length as a quality score. As search strategies, we propose two approaches: the first algorithm selects a good pattern set from a large candidate set, while the second is a parameter-free any-time algorithm that mines pattern sets directly from the data. Experiments on synthetic and real data demonstrate that we efficiently discover small sets of informative patterns.
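To make the idea of using encoded length as a quality score concrete, the following is a minimal sketch and not the encoding formalised in the paper. It assumes patterns match contiguously (the paper's serial episodes are more general), uses a greedy left-to-right cover, and charges a simple uniform per-symbol cost for the model; the function names `cover` and `encoded_length` are hypothetical.

```python
import math
from collections import Counter


def cover(sequence, patterns):
    """Greedily cover the sequence left to right, preferring longer patterns.

    Assumption: patterns are tuples that must match contiguously (no gaps).
    Symbols not covered by any pattern are emitted as singleton codes.
    """
    ordered = sorted(patterns, key=len, reverse=True)
    codes = []
    i = 0
    while i < len(sequence):
        for p in ordered:
            if tuple(sequence[i:i + len(p)]) == p:
                codes.append(p)
                i += len(p)
                break
        else:
            # No pattern matched at position i: fall back to a singleton.
            codes.append((sequence[i],))
            i += 1
    return codes


def encoded_length(sequence, patterns):
    """Toy two-part score in bits: L(model) + L(data | model)."""
    usage = Counter(cover(sequence, patterns))
    total = sum(usage.values())
    # Data cost: Shannon-optimal code length for each code that is used.
    data_bits = sum(-n * math.log2(n / total) for n in usage.values())
    # Model cost (assumed): spell out every non-singleton pattern with a
    # uniform code over the alphabet observed in the sequence.
    alphabet = set(sequence)
    sym_bits = math.log2(len(alphabet)) if len(alphabet) > 1 else 1.0
    model_bits = sum(len(p) * sym_bits for p in patterns if len(p) > 1)
    return model_bits + data_bits


seq = list("abc" * 8 + "d")
baseline = encoded_length(seq, [])
with_pattern = encoded_length(seq, [("a", "b", "c")])
with_useless = encoded_length(seq, [("a", "b", "c"), ("d", "a")])
```

In this toy setting, adding the pattern `("a", "b", "c")` shortens the total encoded length, while adding a pattern that never occurs only increases the model cost. That is the behaviour a pattern set score of this kind is meant to reward and penalise, respectively.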