CMRules: Mining sequential rules common to several sequences

Authors:
Philippe Fournier-Viger;Usef Faghihi;Roger Nkambou;Engelbert Mephu Nguifo
Affiliations:
Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan;Department of Computer Science, University of Memphis, TN, USA;Department of Computer Science, University of Quebec at Montreal, 201, avenue Président-Kennedy, Montréal, Canada;Clermont Université, Université Blaise Pascal, LIMOS, BP 10448, F-63000 Clermont-Ferrand, France and CNRS, UMR 6158, LIMOS, F-63173 Aubiere, France
Venue:
Knowledge-Based Systems
Year:
2012


Abstract

Sequential rule mining is an important data mining task used in a wide range of applications. However, current algorithms for discovering sequential rules common to several sequences use very restrictive definitions of sequential rules, which makes them unable to recognize that similar rules can describe the same phenomenon. This can have several undesirable effects: (1) similar rules are rated differently, (2) rules are not found because they are considered uninteresting when taken individually, and (3) rules are too specific, which makes them less likely to be used for making predictions. In this paper, we address these problems by proposing a more general form of sequential rules in which the items in the antecedent and in the consequent of each rule are unordered. We propose an algorithm named CMRules for mining this form of rules. The algorithm first finds association rules, which prunes the search space to items that occur jointly in many sequences. It then eliminates the association rules that do not meet the minimum support and confidence thresholds once the sequential ordering is taken into account. We evaluate the performance of CMRules in three ways. First, we provide an analysis of its time complexity. Second, we compare its performance (in terms of execution time, memory usage and scalability) with an adaptation of an algorithm from the literature that we name CMDeo. For this comparison, we use three real-life public datasets, which have different characteristics and represent three kinds of data. In many cases, the results show that CMRules is faster and scales better than CMDeo for low support thresholds. Lastly, we report a successful application of the algorithm in a tutoring agent.
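The two-phase idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not state: a sequence is modelled as a list of itemsets; a rule X ⇒ Y is taken to occur in a sequence when some split point places every item of X in the prefix and every item of Y in the suffix; sequential support is the fraction of sequences in which the rule occurs; and sequential confidence divides that count by the number of sequences containing X. The association-rule phase is a brute-force enumeration rather than an efficient miner, and the names `occurs_in_order`, `mine_rules`, `max_size` and the toy `database` are hypothetical.

```python
from itertools import combinations


def occurs_in_order(sequence, antecedent, consequent):
    """Return True if some split point puts all antecedent items in the
    prefix and all consequent items in the suffix (assumed definition)."""
    for k in range(1, len(sequence)):
        prefix = set().union(*sequence[:k])
        suffix = set().union(*sequence[k:])
        if antecedent <= prefix and consequent <= suffix:
            return True
    return False


def mine_rules(sequences, min_sup, min_conf, max_size=3):
    """Brute-force sketch: association rules first, ordering check second."""
    n = len(sequences)
    # Phase 1 ignores order: each sequence is flattened into one transaction.
    transactions = [set().union(*s) for s in sequences]
    items = sorted(set().union(*transactions))
    rules = {}
    for size in range(2, max_size + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            joint = sum(itemset <= t for t in transactions)
            if joint == 0 or joint / n < min_sup:
                continue  # items do not occur jointly in enough sequences
            for r in range(1, size):
                for left in combinations(sorted(itemset), r):
                    x = frozenset(left)
                    y = itemset - x
                    sup_x = sum(x <= t for t in transactions)
                    if joint / sup_x < min_conf:
                        continue  # fails as a plain association rule
                    # Phase 2: recount using the sequential ordering.
                    seq_count = sum(occurs_in_order(s, x, y) for s in sequences)
                    seq_sup = seq_count / n
                    seq_conf = seq_count / sup_x
                    if seq_sup >= min_sup and seq_conf >= min_conf:
                        rules[(x, y)] = (seq_sup, seq_conf)
    return rules


# Hypothetical toy database of four sequences, each a list of itemsets.
database = [
    [{"a", "b"}, {"c"}, {"f"}, {"g"}, {"e"}],
    [{"a", "d"}, {"c"}, {"b"}, {"a", "b", "e", "f"}],
    [{"a"}, {"b"}, {"f"}, {"e"}],
    [{"b"}, {"f", "g"}],
]

rules = mine_rules(database, min_sup=0.5, min_conf=0.75, max_size=2)
for (x, y), (sup, conf) in sorted(rules.items(), key=lambda kv: sorted(kv[0][0])):
    print(sorted(x), "=>", sorted(y), f"seqSup={sup:.2f} seqConf={conf:.2f}")
```

Under these assumptions the first phase is a safe filter: a rule cannot occur in order in more sequences than contain its items, so its sequential support and confidence never exceed the order-free values. In the toy data, {e} ⇒ {a} survives the association-rule phase but is discarded in the second phase because "e" never precedes "a".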