Sequential pattern discovery is a well-studied field in data mining. Episodes are sequential patterns that describe events which often occur in the vicinity of each other. Episodes can impose restrictions on the order of the events, which makes them a versatile technique for describing complex patterns in a sequence. Most research on episodes deals with special cases such as serial, parallel, and injective episodes, while the discovery of general episodes is understudied. In this paper, we extend the definition of an episode so that it can represent cases where events often occur simultaneously. We present an efficient and novel miner for discovering frequent and closed general episodes. This task presents unique challenges. Firstly, we cannot define closure based on frequency. We solve this by computing a more conservative closure, which we use to reduce the search space, and by discovering the closed episodes in a postprocessing step. Secondly, episodes are traditionally represented as directed acyclic graphs. We argue that this representation has drawbacks that lead to redundancy in the output. We address these drawbacks by defining a subset relationship that allows us to remove the redundant episodes. We demonstrate the efficiency of our algorithm, and the need for closed episodes, empirically on synthetic and real-world datasets.
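To make the notion of a general episode concrete, the following is a minimal sketch, not the miner described above. It assumes the traditional representation of an episode as a directed acyclic graph whose nodes carry event labels and whose edges require strict temporal order, and it assumes a fixed-width sliding-window notion of frequency. The function names (`covers`, `window_frequency`), the node identifiers, and the brute-force backtracking search are illustrative choices, and the sketch implements neither the extended definition for simultaneous events nor any closure computation.

```python
from typing import Dict, List, Set, Tuple

Event = Tuple[int, str]          # (timestamp, label)
Edges = Set[Tuple[str, str]]     # (u, v): node u must occur strictly before node v


def covers(events: List[Event], labels: Dict[str, str], edges: Edges) -> bool:
    """Return True if `events` contain an occurrence of the episode.

    `labels` maps each (hypothetical) node id to the event label it requires.
    Each node is matched to a distinct event; every edge must be respected.
    """
    nodes = list(labels)
    assignment: Dict[str, int] = {}  # node id -> index into `events`

    def consistent(node: str, idx: int) -> bool:
        # Check all edges between `node` and nodes that are already matched.
        t = events[idx][0]
        for u, v in edges:
            if u == node and v in assignment and t >= events[assignment[v]][0]:
                return False
            if v == node and u in assignment and events[assignment[u]][0] >= t:
                return False
        return True

    def backtrack(k: int) -> bool:
        if k == len(nodes):
            return True
        node = nodes[k]
        used = set(assignment.values())
        for idx, (_, label) in enumerate(events):
            if idx in used or label != labels[node]:
                continue
            if consistent(node, idx):
                assignment[node] = idx
                if backtrack(k + 1):
                    return True
                del assignment[node]
        return False

    return backtrack(0)


def window_frequency(sequence: List[Event], labels: Dict[str, str],
                     edges: Edges, width: int) -> int:
    """Count integer-aligned windows [t, t + width) that cover the episode."""
    if not sequence:
        return 0
    first = min(t for t, _ in sequence)
    last = max(t for t, _ in sequence)
    count = 0
    for start in range(first - width + 1, last + 1):
        window = [e for e in sequence if start <= e[0] < start + width]
        if covers(window, labels, edges):
            count += 1
    return count


# Example: 'a' must precede both 'b' and 'c'; 'b' and 'c' are unordered.
sequence = [(1, "a"), (2, "b"), (3, "c"), (5, "a"), (6, "c"), (7, "b")]
labels = {"x": "a", "y": "b", "z": "c"}
general = {("x", "y"), ("x", "z")}
serial = {("x", "y"), ("y", "z")}   # a, then b, then c
print(window_frequency(sequence, labels, general, width=3))
print(window_frequency(sequence, labels, serial, width=3))
```

The general episode is covered by more windows than the serial one here, because it leaves the order of `b` and `c` open. Because the edges demand strict order, two events with the same timestamp can never satisfy an edge in this sketch, which illustrates why simultaneous events need a separate treatment in the definition.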