Constraint relaxations for discovering unknown sequential patterns

Authors:
Cláudia Antunes;Arlindo L. Oliveira
Affiliations:
Instituto Superior Técnico / INESC-ID, Department of Information Systems and Computer Science, Lisboa, Portugal;Instituto Superior Técnico / INESC-ID, Department of Information Systems and Computer Science, Lisboa, Portugal
Venue:
KDID'04 Proceedings of the Third international conference on Knowledge Discovery in Inductive Databases
Year:
2004

Abstract

The main drawbacks of sequential pattern mining have been its lack of focus on user expectations and the high number of discovered patterns. However, the commonly accepted solution, the use of constraints, reduces the mining process to verifying which of the specified patterns are frequent, rather than discovering unknown and unexpected patterns. In this paper, we propose a new methodology for mining sequential patterns that keeps the focus on user expectations without compromising the discovery of unknown patterns. Our methodology is based on constraint relaxations, which are used to filter accepted patterns during the mining process. We propose a hierarchy of relaxations, applied to constraints expressed as context-free languages, that classifies the existing relaxations (legal, valid and naïve, previously proposed) and introduces several new classes. The new classes range from the approx and non-accepted relaxations to compositions of different types of relaxations, such as the approx-legal or the non-prefix-valid relaxations. Finally, we present a case study that shows the results achieved by applying this methodology to the analysis of the curricular sequences of computer science students.
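
To make the idea of a relaxation concrete, the following is a minimal sketch of the three previously proposed relaxations (naïve, legal and valid) as pattern filters. It is an illustration under stated assumptions, not the authors' implementation: the abstract targets constraints expressed as context-free languages, whereas this sketch uses a small deterministic finite automaton for brevity, and the definitions follow the usual automaton-based reading (naïve: every item occurs in the constraint; legal: the sequence follows a path from some state; valid: that path ends in an accepting state). The toy constraint `a b* c` and all names below are hypothetical.

```python
from typing import Dict, Optional, Sequence, Set, Tuple

# Hypothetical toy constraint: the regular language a b* c, as a DFA.
TRANSITIONS: Dict[Tuple[int, str], int] = {
    (0, "a"): 1,
    (1, "b"): 1,
    (1, "c"): 2,
}
START: int = 0
ACCEPTING: Set[int] = {2}
STATES: Set[int] = {0, 1, 2}
ALPHABET: Set[str] = {symbol for (_, symbol) in TRANSITIONS}


def run(state: int, seq: Sequence[str]) -> Optional[int]:
    """Follow seq from state; return the final state, or None if stuck."""
    current: Optional[int] = state
    for symbol in seq:
        current = TRANSITIONS.get((current, symbol))
        if current is None:
            return None
    return current


def accepted(seq: Sequence[str]) -> bool:
    """Original constraint: seq is a word of the language."""
    return run(START, seq) in ACCEPTING


def naive(seq: Sequence[str]) -> bool:
    """Naive relaxation (assumed): every item occurs in the constraint."""
    return all(symbol in ALPHABET for symbol in seq)


def legal(seq: Sequence[str]) -> bool:
    """Legal relaxation (assumed): seq follows a path from some state."""
    return any(run(q, seq) is not None for q in STATES)


def valid(seq: Sequence[str]) -> bool:
    """Valid relaxation (assumed): some such path ends in an accepting state."""
    return any(run(q, seq) in ACCEPTING for q in STATES)


if __name__ == "__main__":
    for pattern in (["a", "b", "c"], ["b", "c"], ["a", "b"], ["c", "a"]):
        print(pattern, accepted(pattern), valid(pattern),
              legal(pattern), naive(pattern))
```

In this sketch each relaxation accepts a superset of what the stricter filter accepts: every accepted sequence is valid, every valid sequence is legal, and every legal sequence passes the naïve filter. A miner using a weaker filter therefore keeps patterns that the original constraint would reject.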