A parameterizable enumeration algorithm for sequence mining

Authors: J. David, L. Nourine
Venue: Theoretical Computer Science
Year: 2013


Abstract

In this paper, we introduce a generic framework for mining sequences under various constraints. More precisely, we study the enumeration of all partitions of a word w into multisets of subsequences. We show that, using additional predicates, this generator can be used for frequent subsequence and substring mining. We define the transition graph T_w, whose vertices are multisets of words and whose arcs are transitions between multisets. We show that T_w is a directed acyclic graph and that it admits a covering tree. We use T_w to propose a generic algorithm that enumerates, without redundancy, all multisets that satisfy a set of predicates.
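
To make the enumerated objects concrete, here is a minimal brute-force sketch in Python. It is not the transition-graph algorithm of the paper: it simply tries every partition of the positions of w, reads each block as a subsequence, and removes duplicates with a hash set, so its cost grows with the Bell numbers. The names `set_partitions`, `multisets_of_subsequences`, and `predicate` are hypothetical, and representing a multiset as a sorted tuple of words is an assumption made for illustration.

```python
from typing import Callable, Iterator, List, Tuple

# Assumption: a multiset of words is represented as a sorted tuple of strings.
Multiset = Tuple[str, ...]


def set_partitions(n: int) -> Iterator[List[List[int]]]:
    """Yield every partition of the positions 0..n-1 into non-empty blocks.

    Positions are added in increasing order, so each block is an
    increasing list of indices, i.e. it selects a subsequence.
    """
    def extend(i: int, blocks: List[List[int]]) -> Iterator[List[List[int]]]:
        if i == n:
            yield [list(b) for b in blocks]
            return
        # Put position i into one of the existing blocks ...
        for b in blocks:
            b.append(i)
            yield from extend(i + 1, blocks)
            b.pop()
        # ... or open a new block containing only position i.
        blocks.append([i])
        yield from extend(i + 1, blocks)
        blocks.pop()

    yield from extend(0, [])


def multisets_of_subsequences(
    w: str,
    predicate: Callable[[Multiset], bool] = lambda m: True,
) -> Iterator[Multiset]:
    """Yield each distinct multiset of subsequences partitioning w once,
    keeping only those accepted by the (hypothetical) predicate."""
    seen = set()
    for blocks in set_partitions(len(w)):
        m = tuple(sorted("".join(w[i] for i in b) for b in blocks))
        if m in seen:
            continue  # the same multiset can arise from several partitions
        seen.add(m)
        if predicate(m):
            yield m


if __name__ == "__main__":
    # Example predicate: every word in the multiset has length at most 2.
    for m in multisets_of_subsequences("aab", lambda m: all(len(x) <= 2 for x in m)):
        print(m)
```

For w = "aab", the partitions {0,2},{1} and {0},{1,2} both give the multiset {a, ab}, which is why the sketch needs explicit deduplication. The abstract's covering tree of T_w is what lets the paper's algorithm avoid such redundancy without storing every multiset already seen.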