Mining Frequent Induced Subtrees by Prefix-Tree-Projected Pattern Growth

  • Authors:
  • Lei Zou; Yansheng Lu; Huaming Zhang; Rong Hu; Chong Zhou

  • Affiliations:
  • HuaZhong University of Science and Technology, China; HuaZhong University of Science and Technology, China; University of Alabama in Huntsville, USA; HuaZhong University of Science and Technology, China; HuaZhong University of Science and Technology, China

  • Venue:
  • WAIMW '06 Proceedings of the Seventh International Conference on Web-Age Information Management Workshops
  • Year:
  • 2006

Abstract

Frequent subtree pattern mining is an important data mining problem with broad applications. With the exception of Chopper and XSpanner [8], most existing algorithms are Apriori-like and rely on the candidate-generation-and-test framework. Unfortunately, generating and testing candidate patterns is costly in both time and space, especially when the candidate patterns are numerous and large. To address this problem, Han et al. proposed the technique of pattern growth [6], and Pei et al. proposed the well-known PrefixSpan algorithm for sequential pattern mining [7]. Along this line, in this paper we propose a novel induced subtree mining algorithm, called PrefixTreeISpan (i.e., Prefix-Tree-projected Induced-Subtree pattern), which finds induced subtree patterns by growing the frequent prefix-trees. Using divide and conquer, recursively mining the local length-1 frequent subtree patterns in each prefix-tree-projected database yields the complete set of frequent patterns. Unlike Chopper and XSpanner, PrefixTreeISpan mines induced subtree patterns and does not need a checking process. Our performance study shows that PrefixTreeISpan achieves good performance on both large synthetic datasets and real datasets.
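
The pattern-growth idea that the abstract builds on can be illustrated with a minimal sketch of PrefixSpan-style mining on plain sequences. This is not PrefixTreeISpan itself: the abstract does not give the details of the tree encoding or of the prefix-tree projection, so the sketch below works on sequences of single items only. The function name `prefixspan`, the parameter `min_support`, and the toy database are assumptions made for illustration.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Return {pattern: support} for every frequent subsequence.

    Illustrative sketch of prefix-projected pattern growth on item
    sequences (an assumption; the paper applies the idea to trees).
    """
    results = {}

    def grow(prefix, projected):
        # `projected` is the projected database for `prefix`: a list of
        # (sequence index, start offset) pairs marking each suffix.
        counts = defaultdict(int)
        for sid, start in projected:
            # Count each item at most once per projected suffix.
            for item in set(sequences[sid][start:]):
                counts[item] += 1

        # Every locally frequent item extends the prefix by length 1.
        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support

            # Project on the new pattern: keep what follows the first
            # occurrence of `item` in each suffix.
            new_projected = []
            for sid, start in projected:
                seq = sequences[sid]
                try:
                    pos = seq.index(item, start)
                except ValueError:
                    continue
                new_projected.append((sid, pos + 1))

            # Divide and conquer: recurse on the smaller database.
            grow(pattern, new_projected)

    grow((), [(i, 0) for i in range(len(sequences))])
    return results


# Hypothetical toy database of three sequences.
db = [list("abc"), list("acb"), list("ab")]
frequent = prefixspan(db, min_support=2)
```

No candidate set is generated and tested against the whole database here: each recursive call counts only the items that actually occur in its projected database, which is the property the abstract contrasts with Apriori-like algorithms.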