PCITMiner: prefix-based closed induced tree miner for finding closed induced frequent subtrees

  • Authors:
  • Sangeetha Kutty; Richi Nayak; Yuefeng Li

  • Affiliations:
  • Queensland University of Technology, Brisbane Qld, Australia; Queensland University of Technology, Brisbane Qld, Australia; Queensland University of Technology, Brisbane Qld, Australia

  • Venue:
  • AusDM '07 Proceedings of the sixth Australasian conference on Data mining and analytics - Volume 70
  • Year:
  • 2007

Abstract

Frequent subtree mining has attracted considerable interest among researchers because of its applications in a wide variety of domains, including bioinformatics, XML processing, computational linguistics, and web usage mining. Despite advances in frequent subtree mining, mining the complete set of frequent subtrees is infeasible because the number of frequent subtrees grows combinatorially with the size of the dataset. To provide a reduced and concise representation without information loss, we propose a novel algorithm, PCITMiner (Prefix-based Closed Induced Tree Miner). PCITMiner adopts a prefix-based pattern-growth strategy to find the closed induced frequent subtrees efficiently. Empirical analysis shows that our algorithm significantly outperforms the current state-of-the-art algorithm, PrefixTreeISpan (Zou, Lu, Zhang, Hu and Zhou 2006b).
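
The abstract does not define "closed" or "induced", so the following is only a minimal sketch of the usual definitions, not PCITMiner itself. It assumes rooted, ordered, labeled trees; an induced subtree preserves parent-child edges; and a frequent subtree is closed when no proper supertree has the same support. All names (`matches_at`, `contains`, `support`, `closed_patterns`) and the toy database are hypothetical, and closedness is checked only relative to the supplied candidate list rather than by prefix-based pattern growth.

```python
# A tree is (label, (child, child, ...)); nested tuples are hashable.

def matches_at(pattern, node):
    """True if pattern embeds at node with parent-child edges preserved."""
    p_label, p_children = pattern
    n_label, n_children = node
    if p_label != n_label:
        return False
    # Greedy left-to-right matching of pattern children to an ordered
    # subsequence of the node's children.
    i = 0
    for child in n_children:
        if i < len(p_children) and matches_at(p_children[i], child):
            i += 1
    return i == len(p_children)

def contains(tree, pattern):
    """True if pattern is an induced subtree rooted anywhere in tree."""
    if matches_at(pattern, tree):
        return True
    return any(contains(child, pattern) for child in tree[1])

def support(db, pattern):
    """Number of database trees that contain pattern."""
    return sum(1 for t in db if contains(t, pattern))

def closed_patterns(db, candidates, min_sup):
    """Frequent candidates with no equal-support proper supertree
    among the other frequent candidates (assumed definition)."""
    frequent = {}
    for p in candidates:
        s = support(db, p)
        if s >= min_sup:
            frequent[p] = s
    closed = []
    for p, s in frequent.items():
        absorbed = any(
            q != p and sq == s and contains(q, p)
            for q, sq in frequent.items()
        )
        if not absorbed:
            closed.append(p)
    return closed

# Toy database of three trees (hypothetical data).
leaf = lambda label: (label, ())
db = [
    ("a", (leaf("b"), leaf("c"))),
    ("a", (leaf("b"), leaf("c"), leaf("d"))),
    ("a", (leaf("b"),)),
]
pat_a = leaf("a")
pat_b = leaf("b")
pat_ab = ("a", (leaf("b"),))
pat_abc = ("a", (leaf("b"), leaf("c")))
result = closed_patterns(db, [pat_a, pat_ab, pat_abc, pat_b], min_sup=2)
```

In this toy run, `pat_a` and `pat_b` each occur in all three trees, but so does their supertree `pat_ab`, so only `pat_ab` and `pat_abc` are kept. Dropping the two absorbed patterns loses no support information, which is the sense in which a closed set is a reduced representation.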