Mining closed episodes with simultaneous events

Authors:
Nikolaj Tatti;Boris Cule
Affiliations:
University of Antwerp, Antwerp, Belgium;University of Antwerp, Antwerp, Belgium
Venue:
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2011

Abstract

Sequential pattern discovery is a well-studied field in data mining. Episodes are sequential patterns that describe events that often occur in the vicinity of each other. Episodes can impose restrictions on the order of the events, which makes them a versatile technique for describing complex patterns in a sequence. Most research on episodes deals with special cases such as serial, parallel, and injective episodes, while the discovery of general episodes is understudied. In this paper we extend the definition of an episode so that it can represent cases where events often occur simultaneously. We present a novel and efficient miner for discovering frequent and closed general episodes. This task presents unique challenges. First, closure cannot be defined based on frequency. We solve this by computing a more conservative closure, which we use to reduce the search space, and by discovering the closed episodes in a postprocessing step. Second, episodes are traditionally represented as directed acyclic graphs. We argue that this representation has drawbacks that lead to redundancy in the output. We address these drawbacks by defining a subset relationship that allows us to remove the redundant episodes. We demonstrate empirically, on synthetic and real-world datasets, the efficiency of our algorithm and the need for closed episodes.
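
To make the notion of an episode with simultaneous events concrete, the following is a minimal sketch and not the miner described in the paper. It assumes integer timestamps, restricts itself to serial episodes whose steps are sets of events that must share a timestamp, and uses the number of fixed-width sliding windows containing the episode as its frequency, which is one common definition and not necessarily the one the paper adopts. The names `group_by_time`, `occurs`, and `window_frequency` are hypothetical.

```python
from collections import defaultdict


def group_by_time(events):
    """Collapse (timestamp, label) pairs into sorted (timestamp, label_set) pairs."""
    grouped = defaultdict(set)
    for t, label in events:
        grouped[t].add(label)
    return sorted(grouped.items())


def occurs(episode, window):
    """Greedily check whether a serial episode occurs in a window.

    episode: list of sets; each set must be matched at a single timestamp,
             and successive sets at strictly later timestamps.
    window:  sorted list of (timestamp, label_set) pairs.
    """
    i = 0
    for _, labels in window:
        # Each timestamp is consumed at most once, so matches strictly increase.
        if i < len(episode) and episode[i] <= labels:
            i += 1
    return i == len(episode)


def window_frequency(episode, events, width):
    """Count sliding windows [start, start + width) that contain the episode.

    Assumption: timestamps are integers, so window starts can be enumerated.
    """
    grouped = group_by_time(events)
    if not grouped:
        return 0
    first, last = grouped[0][0], grouped[-1][0]
    count = 0
    for start in range(first - width + 1, last + 1):
        window = [(t, s) for t, s in grouped if start <= t < start + width]
        if occurs(episode, window):
            count += 1
    return count


# Toy sequence: 'a' and 'b' are simultaneous at time 1, but sequential at 4 and 5.
events = [(1, "a"), (1, "b"), (2, "c"), (4, "a"), (5, "b"), (6, "c")]
simultaneous = [{"a", "b"}, {"c"}]   # a and b together, then c
sequential = [{"a"}, {"b"}, {"c"}]   # a, then b, then c
```

On this toy sequence, the two episodes are matched by different windows: `simultaneous` is found only around timestamps 1 and 2, and `sequential` only around timestamps 4 to 6. A pattern language that cannot express simultaneity cannot tell these two situations apart.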