The main drawbacks of sequential pattern mining have been its lack of focus on user expectations and the large number of patterns it discovers. However, the commonly accepted solution, the use of constraints, reduces the mining process to verifying which of the specified patterns are frequent, rather than discovering unknown and unexpected patterns. In this paper, we propose a new methodology for mining sequential patterns that keeps the focus on user expectations without compromising the discovery of unknown patterns. Our methodology is based on constraint relaxations, which are used to filter the accepted patterns during the mining process. We propose a hierarchy of relaxations, applied to constraints expressed as context-free languages, that classifies the existing relaxations (the previously proposed legal, valid and naïve relaxations) and introduces several new classes. The new classes range from the approx and non-accepted relaxations to compositions of different types of relaxations, such as the approx-legal or the non-prefix-valid relaxations. Finally, we present a case study that shows the results achieved by applying this methodology to the analysis of the curricular sequences of computer science students.
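To make the three previously proposed relaxations concrete, the following is a minimal sketch, not the methodology of this paper. It assumes the definitions commonly associated with the SPIRIT family of algorithms and, for brevity, a regular constraint given as a small deterministic finite automaton rather than a context-free language (which would require a pushdown automaton). The toy constraint `a b* c`, the single-item sequence model, and all names (`DELTA`, `run`, `naive`, `legal`, `valid`, `accepted`) are hypothetical and chosen only for illustration. The new relaxations (approx, non-accepted and their compositions) are not sketched, since the abstract does not define them.

```python
from typing import Dict, Optional, Sequence, Set, Tuple

# Hypothetical toy constraint: the regular language  a b* c  as a DFA.
START: int = 0
ACCEPTING: Set[int] = {2}
DELTA: Dict[Tuple[int, str], int] = {
    (0, "a"): 1,
    (1, "b"): 1,
    (1, "c"): 2,
}
STATES: Set[int] = {0, 1, 2}
ALPHABET: Set[str] = {symbol for (_, symbol) in DELTA}


def run(state: int, seq: Sequence[str]) -> Optional[int]:
    """Follow seq from state; return the end state, or None if the run gets stuck."""
    for symbol in seq:
        nxt = DELTA.get((state, symbol))
        if nxt is None:
            return None
        state = nxt
    return state


def accepted(seq: Sequence[str]) -> bool:
    """Full constraint: seq is accepted from the start state."""
    end = run(START, seq)
    return end is not None and end in ACCEPTING


def valid(seq: Sequence[str]) -> bool:
    """Valid relaxation (assumed): seq leads from SOME state to an accepting state."""
    return any(run(q, seq) in ACCEPTING for q in STATES)


def legal(seq: Sequence[str]) -> bool:
    """Legal relaxation (assumed): seq labels a path starting at SOME state."""
    return any(run(q, seq) is not None for q in STATES)


def naive(seq: Sequence[str]) -> bool:
    """Naive relaxation (assumed): every item of seq occurs in the constraint."""
    return all(symbol in ALPHABET for symbol in seq)


if __name__ == "__main__":
    for candidate in (["a", "b", "c"], ["b", "c"], ["a", "b"], ["c", "a"], ["a", "d"]):
        print(candidate, accepted(candidate), valid(candidate),
              legal(candidate), naive(candidate))
```

Under these assumed definitions each predicate is weaker than the one before it (accepted implies valid, valid implies legal, legal implies naive), so a miner that filters candidates with a weaker predicate retains more patterns than the user specified. This is the kind of trade-off that a hierarchy of relaxations makes explicit.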