BIDE: Efficient Mining of Frequent Closed Sequences

  • Authors:
  • Jianyong Wang; Jiawei Han

  • Venue:
  • ICDE '04 Proceedings of the 20th International Conference on Data Engineering
  • Year:
  • 2004

Abstract

Previous studies have presented convincing arguments that a frequent pattern mining algorithm should not mine all frequent patterns but only the closed ones, because the latter leads not only to a more compact yet complete result set but also to better efficiency. However, most of the previously developed closed pattern mining algorithms work under the candidate maintenance-and-test paradigm, which is inherently costly in both runtime and space usage when the support threshold is low or the patterns become long. In this paper, we present BIDE, an efficient algorithm for mining frequent closed sequences without candidate maintenance. It adopts a novel sequence closure checking scheme called BI-Directional Extension, and prunes the search space more deeply than the previous algorithms by using the BackScan pruning method and the Scan-Skip optimization technique. A thorough performance study with both sparse and dense real-life data sets has demonstrated that BIDE significantly outperforms the previous algorithms: it consumes order(s) of magnitude less memory and can be more than an order of magnitude faster. It is also linearly scalable in terms of database size.
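
To make the notion of a frequent closed sequence concrete, the following is a minimal brute-force sketch in Python. It is not BIDE: it enumerates and stores every candidate pattern, which is exactly the kind of candidate maintenance the abstract says BIDE avoids, and it is only practical for tiny inputs. The function names and the toy database are illustrative assumptions, and sequences are simplified to one item per element. A frequent pattern is treated as closed when no longer frequent pattern contains it with the same support.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in order (gaps allowed)."""
    it = iter(sequence)
    # 'item in it' consumes the iterator, so matches must appear left to right.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of database sequences that contain pattern as a subsequence."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def frequent_sequences(database, min_sup):
    """Brute force: collect every subsequence of every input sequence,
    then keep those whose support reaches min_sup."""
    candidates = set()
    for seq in database:
        for length in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), length):
                candidates.add(tuple(seq[i] for i in idx))
    result = {}
    for pattern in candidates:
        sup = support(pattern, database)
        if sup >= min_sup:
            result[pattern] = sup
    return result


def closed_sequences(database, min_sup):
    """Keep only frequent patterns with no longer super-sequence of equal support."""
    freq = frequent_sequences(database, min_sup)
    closed = {}
    for pattern, sup in freq.items():
        absorbed = any(
            len(other) > len(pattern)
            and other_sup == sup
            and is_subsequence(pattern, other)
            for other, other_sup in freq.items()
        )
        if not absorbed:
            closed[pattern] = sup
    return closed


# Hypothetical toy database of three sequences, minimum support 2.
toy_db = ["ab", "abc", "ac"]
print(closed_sequences(toy_db, 2))
```

On this toy input there are five frequent patterns (a, b, c, ab, ac), but b and c are absorbed by ab and ac, which have the same support of 2, so only a, ab, and ac are closed. The closed set is smaller, yet the support of every frequent pattern can still be recovered from it, which is the "more compact yet complete" property the abstract refers to.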