Mining sequential patterns from probabilistic databases by pattern-growth

  • Authors:
  • Muhammad Muzammal

  • Affiliations:
  • Department of Computer Science, University of Leicester, UK

  • Venue:
  • BNCOD'11 Proceedings of the 28th British national conference on Advances in databases
  • Year:
  • 2011

Abstract

We propose a pattern-growth approach for mining sequential patterns from probabilistic databases. The model of uncertainty we consider covers situations in which the association of an event with a source is uncertain, and we consider the problem of enumerating all sequences whose expected support satisfies a user-defined threshold θ. An earlier work [Muzammal and Raman, PAKDD'11] adapted representative candidate generate-and-test approaches, GSP (breadth-first sequence lattice traversal) and SPADE/SPAM (depth-first sequence lattice traversal), to the probabilistic case. Its authors also noted the difficulties in generalizing PrefixSpan to the probabilistic case (PrefixSpan is a pattern-growth algorithm, considered to be the best performer for deterministic sequential pattern mining). We overcome these difficulties in this note and adapt PrefixSpan to work under probabilistic settings. We then report on an experimental evaluation of the candidate generate-and-test approaches against the pattern-growth approach.
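
To make the notion of expected support concrete, the following is a minimal Python sketch and not the paper's algorithm. It assumes a simplified encoding that does not come from the abstract: each event carries a single item and a probability distribution over sources, the source of each event is chosen independently, and a pattern is a list of single items. The names `Event`, `support_probability`, `expected_support`, and `is_frequent` are hypothetical. Under these assumptions, the expected support of a pattern is the sum, over sources, of the probability that the source's event sequence contains the pattern as a subsequence, which a small dynamic program can compute.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (an assumption, not taken from the paper):
# an event is (item, {source: probability that the event belongs to that source}).
Event = Tuple[str, Dict[str, float]]


def support_probability(pattern: List[str], events: List[Event], source: str) -> float:
    """Probability that `source` supports `pattern` as a subsequence.

    Assumes each event is attached to `source` independently of the others.
    prob[i] holds the probability that the first i items of the pattern
    occur, in order, among the events processed so far.
    """
    prob = [1.0] + [0.0] * len(pattern)
    for item, dist in events:
        p = dist.get(source, 0.0)
        # Walk right to left so one event extends a match by at most one item.
        for i in range(len(pattern), 0, -1):
            if pattern[i - 1] == item:
                prob[i] = (1.0 - p) * prob[i] + p * prob[i - 1]
    return prob[-1]


def expected_support(pattern: List[str], events: List[Event]) -> float:
    """Sum of per-source support probabilities (linearity of expectation)."""
    sources = {s for _, dist in events for s in dist}
    return sum(support_probability(pattern, events, s) for s in sources)


def is_frequent(pattern: List[str], events: List[Event], threshold: float) -> bool:
    """True if the expected support meets the user-defined threshold."""
    return expected_support(pattern, events) >= threshold


# Toy data: three events in time order, each with an uncertain source.
events: List[Event] = [
    ("a", {"X": 0.6, "Y": 0.4}),
    ("b", {"X": 0.5, "Y": 0.5}),
    ("a", {"Y": 1.0}),
]
```

On this toy data, `expected_support(["a", "b"], events)` sums 0.3 for source `X` and 0.2 for source `Y`. A miner of either family, candidate generate-and-test or pattern-growth, would need an evaluation of this kind for each pattern it considers; how PrefixSpan's projected databases are adapted to carry such probabilities is the subject of the paper and is not reproduced here.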