Frequent subtree pattern mining is an important data mining problem with broad applications. Most existing algorithms, with the exception of Chopper and XSpanner [8], follow the Apriori-like candidate-generation-and-test framework. Unfortunately, generating and testing candidate patterns is costly in both time and space, especially when the candidate patterns are numerous and large. To address this problem, Han et al. proposed the pattern-growth technique [6], and Pei et al. proposed the well-known PrefixSpan algorithm for sequential pattern mining [7]. Along this line, we propose a novel induced subtree mining algorithm called PrefixTreeISpan (Prefix-Tree-projected Induced-Subtree pattern), which finds induced subtree patterns by growing frequent prefix-trees. Following a divide-and-conquer strategy, recursively mining the local length-1 frequent subtree patterns in each prefix-tree-projected database yields the complete set of frequent patterns. Unlike Chopper and XSpanner, PrefixTreeISpan mines induced subtree patterns and does not need a checking process. Our performance study shows that PrefixTreeISpan performs well on both various large synthetic datasets and real datasets.
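The pattern-growth idea summarized above can be illustrated on plain sequences, which is the setting of PrefixSpan. The following is a minimal sketch, not PrefixTreeISpan itself: it works on sequences of single items rather than trees, and the names `prefixspan`, `grow`, `sequences`, and `min_support` are assumptions chosen for illustration. It shows how each frequent prefix is extended only by items that are locally frequent in its projected database, so no global candidate set is generated and tested.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern: support} for every frequent subsequence.

    Simplified PrefixSpan-style sketch on sequences of single items
    (an illustrative assumption; the tree case needs tree projections).
    """
    results = {}

    def grow(prefix, projected):
        # projected holds (sequence index, start offset) pairs: the
        # suffixes that remain after matching `prefix`.
        counts = defaultdict(int)
        for seq_id, start in projected:
            # Count each item at most once per sequence.
            for item in set(sequences[seq_id][start:]):
                counts[item] += 1

        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support

            # Project on the first occurrence of `item` in each suffix.
            new_projected = []
            for seq_id, start in projected:
                try:
                    pos = sequences[seq_id].index(item, start)
                except ValueError:
                    continue  # item absent from this suffix
                new_projected.append((seq_id, pos + 1))

            # Recurse: mine length-1 patterns in the smaller database.
            grow(pattern, new_projected)

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


# Toy database, made up for illustration only.
patterns = prefixspan([["a", "b", "c"], ["a", "c"], ["b", "c"]], min_support=2)
```

Each recursive call sees only the suffixes that still match the current prefix, which is the divide-and-conquer step. A tree miner would presumably replace the suffix projection with a projection onto the parts of each tree that can extend the current prefix-tree.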