Mining frequent arrangements of temporal intervals

  • Authors:
  • Panagiotis Papapetrou;George Kollios;Stan Sclaroff;Dimitrios Gunopulos

  • Affiliations:
  • Boston University, Department of Computer Science, 02215, Boston, MA, USA;Boston University, Department of Computer Science, 02215, Boston, MA, USA;Boston University, Department of Computer Science, 02215, Boston, MA, USA;University of Athens, Department of Informatics and Telecommunications, Athens, Greece

  • Venue:
  • Knowledge and Information Systems
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

The problem of discovering frequent arrangements of temporal intervals is studied. It is assumed that the database consists of sequences of events, where an event occurs during a time-interval. The goal is to mine temporal arrangements of event intervals that appear frequently in the database. The motivation of this work is the observation that in practice most events are not instantaneous but occur over a period of time and different events may occur concurrently. Thus, there are many practical applications that require mining such temporal correlations between intervals including the linguistic analysis of annotated data from American Sign Language as well as network and biological data. Three efficient methods to find frequent arrangements of temporal intervals are described; the first two are tree-based and use breadth and depth first search to mine the set of frequent arrangements, whereas the third one is prefix-based. The above methods apply efficient pruning techniques that include a set of constraints that add user-controlled focus into the mining process. Moreover, based on the extracted patterns a standard method for mining association rules is employed that applies different interestingness measures to evaluate the significance of the discovered patterns and rules. The performance of the proposed algorithms is evaluated and compared with other approaches on real (American Sign Language annotations and network data) and large synthetic datasets.