PCITMiner: prefix-based closed induced tree miner for finding closed induced frequent subtrees

Authors:
Sangeetha Kutty; Richi Nayak; Yuefeng Li
Affiliations:
Queensland University of Technology, Brisbane Qld, Australia; Queensland University of Technology, Brisbane Qld, Australia; Queensland University of Technology, Brisbane Qld, Australia
Venue:
AusDM '07 Proceedings of the sixth Australasian conference on Data mining and analytics - Volume 70
Year:
2007

Citing 12

Fast discovery of association rules
Advances in knowledge discovery and data mining

Efficiently mining frequent trees in a forest
Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining

TreeFinder: a First Step towards XML Data Mining
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining

Efficient Data Mining for Maximal Frequent Subtrees
ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining

BIDE: Efficient Mining of Frequent Closed Sequences
ICDE '04 Proceedings of the 20th International Conference on Data Engineering

HybridTreeMiner: An Efficient Algorithm for Mining Frequent Rooted Trees and Free Trees Using Canonical Forms
SSDBM '04 Proceedings of the 16th International Conference on Scientific and Statistical Database Management

Efficiently Mining Frequent Trees in a Forest: Algorithms and Applications
IEEE Transactions on Knowledge and Data Engineering

Efficient Mining of High Branching Factor Attribute Trees
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining

TRIPS and TIDES: new algorithms for tree mining
CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management

Mining Frequent Induced Subtrees by Prefix-Tree-Projected Pattern Growth
WAIMW '06 Proceedings of the Seventh International Conference on Web-Age Information Management Workshops

Frequent Subtree Mining - An Overview
Fundamenta Informaticae - Advances in Mining Graphs, Trees and Sequences

PrefixTreeESpan: a pattern growth algorithm for mining embedded subtrees
WISE'06 Proceedings of the 7th international conference on Web Information Systems

Cited 3

Clustering XML Documents Using Closed Frequent Subtrees: A Structural Similarity Approach
Focused Access to XML Documents

HCX: an efficient hybrid clustering approach for XML documents
Proceedings of the 9th ACM symposium on Document engineering

Utilising semantic tags in XML clustering
INEX'09 Proceedings of the Focused retrieval and evaluation, and 8th international conference on Initiative for the evaluation of XML retrieval

Abstract

Frequent subtree mining has attracted a great deal of interest among researchers because of its applications in a wide variety of domains, including bioinformatics, XML processing, computational linguistics, and web usage mining. Despite the advances in frequent subtree mining, mining the complete set of frequent subtrees is infeasible because the number of frequent subtrees grows combinatorially with the size of the dataset. To provide a reduced and concise representation without information loss, we propose a novel algorithm, PCITMiner (Prefix-based Closed Induced Tree Miner). PCITMiner adopts the prefix-based pattern growth strategy to discover the closed induced frequent subtrees efficiently. Empirical analysis shows that our algorithm significantly outperforms the current state-of-the-art algorithm, PrefixTreeISpan (Zou, Lu, Zhang, Hu and Zhou 2006b).
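
The abstract does not spell out what makes a frequent subtree "closed". The sketch below assumes the usual definition: a frequent subtree is closed if no proper supertree has the same support. It also assumes ordered, labelled trees. It is a brute-force filter over a hand-written candidate list, meant only to illustrate why the closed set is smaller yet loses no support information; it is not PCITMiner's prefix-based pattern growth. The names `node`, `contains`, `support`, `closed_frequent` and the toy database are hypothetical.

```python
from typing import Dict, List, Tuple

Tree = Tuple[str, tuple]  # (label, (child, child, ...)); illustrative encoding


def node(label: str, *children: Tree) -> Tree:
    """Build an ordered, labelled tree as a nested (hashable) tuple."""
    return (label, children)


def matches_at_root(pattern: Tree, tree: Tree) -> bool:
    """True if `pattern` maps onto `tree` root-to-root, keeping parent-child
    edges and the left-to-right order of siblings (induced, ordered)."""
    if pattern[0] != tree[0]:
        return False
    remaining = iter(tree[1])
    # Greedy in-order matching: each pattern child takes the earliest
    # not-yet-consumed tree child that it fits.
    return all(any(matches_at_root(pc, tc) for tc in remaining)
               for pc in pattern[1])


def contains(tree: Tree, pattern: Tree) -> bool:
    """True if `pattern` occurs as an induced subtree anywhere in `tree`."""
    return matches_at_root(pattern, tree) or any(
        contains(child, pattern) for child in tree[1])


def support(pattern: Tree, database: List[Tree]) -> int:
    """Number of database trees that contain `pattern`."""
    return sum(contains(t, pattern) for t in database)


def closed_frequent(candidates: List[Tree], database: List[Tree],
                    min_support: int) -> Dict[Tree, int]:
    """Keep frequent candidates with no proper supertree (among the
    candidates) of equal support. Assumption: `candidates` already lists
    every frequent subtree; a real miner would generate them."""
    counts = {p: support(p, database) for p in candidates}
    frequent = {p: s for p, s in counts.items() if s >= min_support}
    return {p: s for p, s in frequent.items()
            if not any(q != p and sq == s and contains(q, p)
                       for q, sq in frequent.items())}


# Toy database of three trees (hypothetical data).
database = [
    node("a", node("b"), node("c")),
    node("a", node("b"), node("c"), node("d")),
    node("a", node("b")),
]
A, B, C = node("a"), node("b"), node("c")
AB, AC = node("a", node("b")), node("a", node("c"))
ABC = node("a", node("b"), node("c"))

result = closed_frequent([A, B, C, AB, AC, ABC], database, min_support=2)
# Six frequent subtrees reduce to two closed ones: AB and ABC.
print(result)
```

In this toy run the support of every discarded subtree can be read off a closed supertree (for example, `A` and `B` share the support of `AB`), which is the sense in which the closed set is a concise representation without information loss.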