Sequential pattern mining in multi-relational datasets

Authors:
Carlos Abreu Ferreira; João Gama; Vítor Santos Costa
Affiliations:
LIAAD, INESC LA and CRACS, INESC LA, University of Porto and ISEP, Institute of Engineering of Porto; LIAAD, INESC LA; CRACS, INESC LA, University of Porto
Venue:
CAEPIA'09 Proceedings of the Current topics in artificial intelligence, and 13th conference on Spanish association for artificial intelligence
Year:
2009

Abstract

We present a framework for mining sequential temporal patterns from multi-relational databases. To exploit logic-relational information without resorting to aggregation, we convert the multi-relational dataset into what we call a multi-sequence database. Each example in the multi-relational target table is coded as a sequence that combines intra-table and inter-table relational temporal information, which allows us to find heterogeneous temporal patterns with standard sequence miners. The framework builds on the strong results achieved by previous propositionalization strategies. We follow a pipelined approach: first, we use a sequence miner to find frequent sequences in the multi-sequence database. Next, we select the most interesting patterns, namely those that are discriminative and class-correlated, to augment the representational space of the examples. In the final step, we build a classifier model by giving the enlarged target table as input to a classification algorithm. We evaluate the framework on a motivating application, the hepatitis multi-relational dataset, and demonstrate its effectiveness by addressing two problems defined on that dataset.
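
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy tables, the item encoding `test=result`, the naive enumeration miner (standing in for a real sequence miner such as PrefixSpan), the support threshold, and the class-gap selection score are all assumptions made for illustration. For brevity it uses a single related table, so only the intra-table ordering of events is shown.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data, NOT taken from the hepatitis dataset.
# Target table: example id -> class label.
patients = {1: "pos", 2: "pos", 3: "neg", 4: "neg"}

# Related table: (example id, time stamp, test name, discretised result).
exams = [
    (1, 1, "GOT", "high"), (1, 2, "GPT", "high"), (1, 3, "ALB", "low"),
    (2, 1, "GOT", "high"), (2, 3, "GPT", "high"),
    (3, 1, "GOT", "normal"), (3, 2, "GPT", "high"),
    (4, 1, "GPT", "high"), (4, 2, "GOT", "high"),
]


def build_sequences(target, events):
    """Code each example as a time-ordered sequence of 'test=result' items."""
    sequences = {pid: [] for pid in target}
    for pid, _time, test, result in sorted(events, key=lambda e: (e[0], e[1])):
        sequences[pid].append(f"{test}={result}")
    return sequences


def frequent_patterns(sequences, min_support, max_len=2):
    """Naive miner: count the sequences containing each gapped subsequence."""
    counts = Counter()
    for sequence in sequences.values():
        found = set()  # a pattern counts once per sequence
        for length in range(1, max_len + 1):
            found.update(combinations(sequence, length))
        counts.update(found)
    return {p: c for p, c in counts.items() if c >= min_support}


def contains(sequence, pattern):
    """True if the pattern items occur in the sequence in order, gaps allowed."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def class_gap(pattern, sequences, target, positive="pos"):
    """Absolute difference of the pattern's relative support in the two classes."""
    pos = [contains(s, pattern) for pid, s in sequences.items() if target[pid] == positive]
    neg = [contains(s, pattern) for pid, s in sequences.items() if target[pid] != positive]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))


# Step 1: multi-relational data -> sequence database.
sequences = build_sequences(patients, exams)
# Step 2: mine frequent sequences.
frequent = frequent_patterns(sequences, min_support=2)
# Step 3: keep only patterns that separate the classes (assumed threshold).
selected = sorted(p for p in frequent if class_gap(p, sequences, patients) >= 1.0)
# Step 4: enlarge the target table with one binary attribute per pattern.
augmented = {
    pid: {"class": label,
          **{" -> ".join(p): int(contains(sequences[pid], p)) for p in selected}}
    for pid, label in patients.items()
}
```

In this toy setting the enlarged table `augmented` holds one binary attribute per selected pattern and could be handed to any propositional classifier. That final step is omitted here because the abstract does not name a specific learning algorithm.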