Ranking sequential patterns with respect to significance

  • Authors:
  • Robert Gwadera;Fabio Crestani

  • Affiliations:
  • Universita della Svizzera Italiana, Lugano, Switzerland;Universita della Svizzera Italiana, Lugano, Switzerland

  • Venue:
  • PAKDD'10 Proceedings of the 14th Pacific-Asia conference on Advances in Knowledge Discovery and Data Mining - Volume Part I
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present a reliable universal method for ranking sequential patterns (itemset-sequences) with respect to significance in the problem of frequent sequential pattern mining. We approach the problem by first building a probabilistic reference model for the collection of itemset-sequences and then deriving an analytical formula for the frequency for sequential patterns in the reference model. We rank sequential patterns by computing the divergence between their actual frequencies and their frequencies in the reference model. We demonstrate the applicability of the presented method for discovering dependencies between streams of news stories in terms of significant sequential patterns, which is an important problem in multi-stream text mining and the topic detection and tracking research.