Multi-Dimensional Relational Sequence Mining

  • Authors:
  • Floriana Esposito;Nicola Di Mauro;Teresa M.A. Basile;Stefano Ferilli

  • Affiliations:
  • (Correspd.) Università degli Studi di Bari, Dipartimento di Informatica, 70125 Bari, Italy. {esposito,ndm,basile,ferilli}@di.uniba.it

  • Venue:
  • Fundamenta Informaticae - Progress on Multi-Relational Data Mining
  • Year:
  • 2008

Abstract

This paper addresses the discovery of frequent multi-dimensional patterns from relational sequences. The wide variety of applications of sequential pattern mining, such as user profiling, medicine, local weather forecasting and bioinformatics, makes this problem one of the central topics in data mining. Sequential information, however, may concern data along multiple dimensions, which makes mining sequential patterns from multi-dimensional information very important. In a multi-dimensional sequence each event depends on more than one dimension, as in spatio-temporal sequences, where an event may be spatially or temporally related to other events. In the literature, the multi-relational data mining approach has been successfully applied to knowledge discovery from complex data. However, no existing contribution handles the general case of multi-dimensional data in which, for example, spatial and temporal information may co-exist. This work considers the possibility of mining complex patterns, expressed in a first-order language, in which events may occur along different dimensions. Specifically, multi-dimensional patterns are defined as a set of atomic first-order formulae in which events are explicitly represented by a variable and the relations between events are represented by a set of dimensional predicates. A complete framework and an Inductive Logic Programming algorithm to tackle this problem are presented, along with experiments on artificial and real multi-dimensional sequences that demonstrate its effectiveness.
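
To make the pattern definition more concrete, the following is a minimal illustrative sketch in Python, not the algorithm presented in the paper. It assumes a relational sequence can be encoded as a set of ground atoms, that a pattern is a list of atoms whose capitalized arguments are variables standing for events, and that a pattern covers a sequence when some variable binding maps all of its atoms onto atoms of the sequence. The predicate names (`rain`, `wind`, `next_t`, `near`), the helper functions, and the toy data are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

Atom = Tuple[str, ...]  # (predicate, arg1, arg2, ...)


def is_var(term: str) -> bool:
    # Assumed convention: capitalized terms are variables, others are constants.
    return term[:1].isupper()


def match_atom(pattern: Atom, fact: Atom, binding: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Try to extend `binding` so that `pattern` equals the ground atom `fact`."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    extended = dict(binding)
    for p, f in zip(pattern[1:], fact[1:]):
        if is_var(p):
            if extended.setdefault(p, f) != f:
                return None  # variable already bound to a different event
        elif p != f:
            return None  # constant mismatch
    return extended


def covers(pattern: Sequence[Atom], sequence: Set[Atom],
           binding: Optional[Dict[str, str]] = None) -> bool:
    """Backtracking search for one binding that maps every pattern atom into the sequence."""
    binding = {} if binding is None else binding
    if not pattern:
        return True
    first, rest = pattern[0], pattern[1:]
    for fact in sequence:
        extended = match_atom(first, fact, binding)
        if extended is not None and covers(rest, sequence, extended):
            return True
    return False


def support(pattern: Sequence[Atom], sequences: List[Set[Atom]]) -> int:
    """Number of sequences covered by the pattern."""
    return sum(covers(pattern, s) for s in sequences)


# Toy data: next_t is a temporal dimensional predicate, near a spatial one.
sequences: List[Set[Atom]] = [
    {("rain", "e1"), ("wind", "e2"), ("next_t", "e1", "e2"), ("near", "e1", "e2")},
    {("rain", "e1"), ("wind", "e2"), ("next_t", "e1", "e2")},
]

temporal_pattern = [("rain", "X"), ("next_t", "X", "Y"), ("wind", "Y")]
spatio_temporal_pattern = temporal_pattern + [("near", "X", "Y")]

print(support(temporal_pattern, sequences))
print(support(spatio_temporal_pattern, sequences))
```

In this toy setting a pattern would count as frequent when its support reaches a user-chosen threshold. The sketch only checks coverage of a given pattern; it does not generate or refine candidate patterns, and it does not require distinct variables to bind to distinct events.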