Generalization of pattern-growth methods for sequential pattern mining with gap constraints

  • Authors:
  • Cláudia Antunes;Arlindo L. Oliveira

  • Affiliations:
  • Department of Information Systems and Computer Science, Instituto Superior Técnico, INESC-ID, Lisboa, Portugal;Department of Information Systems and Computer Science, Instituto Superior Técnico, INESC-ID, Lisboa, Portugal

  • Venue:
  • MLDM'03 Proceedings of the 3rd international conference on Machine learning and data mining in pattern recognition
  • Year:
  • 2003

Abstract

Sequential pattern mining is one of the problems that have received particular attention in the general area of data mining. Despite important developments in recent years, the best algorithm in the area (PrefixSpan) does not deal with gap constraints and consequently does not allow background knowledge to be introduced into the process. In this paper we present a generalization of the PrefixSpan algorithm that deals with gap constraints, using a new method to generate projected databases. Performance and scalability studies were conducted on synthetic and real-life datasets, and the respective results are presented.
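
To illustrate why gap constraints change how projected databases must be built, here is a minimal sketch of pattern growth under a maximum-gap constraint. It is not the paper's algorithm. It assumes sequences of single items (no itemsets), and it assumes that the gap is measured as the distance in positions between consecutive pattern elements, so `max_gap=1` means the elements must be adjacent. The names `mine_with_max_gap`, `min_support` and `max_gap` are hypothetical. The sketch records every position at which an occurrence of the current prefix ends, rather than only the first one, because under a gap constraint the first occurrence may not be extensible while a later one is.

```python
from collections import defaultdict


def mine_with_max_gap(sequences, min_support, max_gap):
    """Return {pattern (tuple of items): support} for all frequent patterns.

    Assumptions (illustrative only): each sequence is a list of single items,
    and consecutive pattern elements must be at most `max_gap` positions apart.
    """
    results = {}

    def grow(pattern, projection):
        # projection maps a sequence index to the set of positions at which
        # some occurrence of `pattern` ends in that sequence.
        extensions = defaultdict(lambda: defaultdict(set))
        for sid, ends in projection.items():
            seq = sequences[sid]
            for end in ends:
                # Only positions within the allowed gap can extend this occurrence.
                last = min(end + max_gap, len(seq) - 1)
                for pos in range(end + 1, last + 1):
                    extensions[seq[pos]][sid].add(pos)
        for item, new_projection in extensions.items():
            support = len(new_projection)  # number of supporting sequences
            if support >= min_support:
                new_pattern = pattern + (item,)
                results[new_pattern] = support
                grow(new_pattern, new_projection)

    # Length-1 patterns: every occurrence of an item is a valid end position.
    initial = defaultdict(lambda: defaultdict(set))
    for sid, seq in enumerate(sequences):
        for pos, item in enumerate(seq):
            initial[item][sid].add(pos)
    for item, projection in initial.items():
        if len(projection) >= min_support:
            results[(item,)] = len(projection)
            grow((item,), projection)
    return results
```

For example, in the sequence `b a a b` with `max_gap=1`, the first `a` is followed by another `a`, so only the second `a` can be extended to the pattern `a b`. A projection that kept just the first occurrence would miss it.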