Mining frequent trees based on topology projection

Authors:
Ma Haibing;Wang Chen;Li Ronglu;Liu Yong;Hu Yunfa
Affiliations:
Computer and Information Technology Department, Fudan University, Shanghai, China;Computer and Information Technology Department, Fudan University, Shanghai, China;Computer and Information Technology Department, Fudan University, Shanghai, China;Computer and Information Technology Department, Fudan University, Shanghai, China;Computer and Information Technology Department, Fudan University, Shanghai, China
Venue:
APWeb'05 Proceedings of the 7th Asia-Pacific web conference on Web Technologies Research and Development
Year:
2005

Abstract

Methods for mining frequent trees are widely used in domains such as bioinformatics, web mining, and chemical compound structure mining. In this paper, we present TG, an efficient pattern-growth algorithm for mining frequent embedded subtrees in a forest of rooted, labeled, and ordered trees. It uses a rightmost-path expansion scheme to construct the complete pattern growth space and creates a projected database for every growth point of the pattern to be grown. The problem is thereby transformed from mining frequent trees into finding frequent nodes in the projected databases. We conduct detailed experiments to test its performance and scalability and find that TG outperforms TreeMiner, one of the fastest previously proposed methods, by a factor of 4 to 15.
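The general idea of rightmost-path expansion combined with projection can be illustrated with a small sketch. This is not the TG algorithm from the paper, whose details the abstract does not give. It is a minimal illustration under stated assumptions: trees are nested (label, children) tuples, a pattern is a preorder tuple of (depth, label) pairs, and the "projected database" of a pattern is approximated by a set of occurrences that record where its rightmost path maps in each tree. The names preorder, mine, and grow are hypothetical.

```python
from collections import defaultdict

def preorder(tree):
    """Flatten a nested (label, [children]) tree into preorder arrays.

    Returns (labels, ends), where ends[i] is the preorder index of the
    last descendant of node i (or i itself for a leaf).
    """
    labels, ends = [], []
    def visit(node):
        idx = len(labels)
        labels.append(node[0])
        ends.append(idx)
        for child in node[1]:
            visit(child)
        ends[idx] = len(labels) - 1
    visit(tree)
    return labels, ends

def mine(forest, minsup):
    """Return {pattern: support} for frequent embedded ordered subtrees.

    Assumption: support is the number of trees containing the pattern.
    """
    db = [preorder(t) for t in forest]
    result = {}

    def grow(pattern, occs):
        # occs: set of (tree id, images of the pattern's rightmost path)
        result[pattern] = len({tid for tid, _ in occs})
        projected = defaultdict(set)
        for tid, path in occs:
            labels, ends = db[tid]
            # Growth point d: attach a new rightmost child at depth d.
            for d in range(1, len(path) + 1):
                parent = path[d - 1]
                # Skip past the subtree of the current rightmost child, if any.
                start = ends[path[d]] + 1 if d < len(path) else parent + 1
                # Any descendant of the parent's image qualifies (embedded).
                for t in range(start, ends[parent] + 1):
                    projected[(d, labels[t])].add((tid, path[:d] + (t,)))
        # Finding frequent nodes in the projections yields the extensions.
        for ext, ext_occs in projected.items():
            if len({tid for tid, _ in ext_occs}) >= minsup:
                grow(pattern + (ext,), ext_occs)

    roots = defaultdict(set)
    for tid, (labels, _) in enumerate(db):
        for i, lab in enumerate(labels):
            roots[lab].add((tid, (i,)))
    for lab, occs in roots.items():
        if len({tid for tid, _ in occs}) >= minsup:
            grow(((0, lab),), occs)
    return result

forest = [
    ("a", [("b", [("c", [])]), ("d", [])]),
    ("a", [("c", []), ("d", [])]),
]
result = mine(forest, minsup=2)
```

With this toy forest, the pattern a(c, d) is reported as frequent even though c is a grandchild of a in the first tree, which is what distinguishes embedded subtrees from induced ones. Each extension is counted only from the occurrences of the pattern it grows from, so no candidate is generated that does not appear in the data.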