Mining closed frequent free trees in graph databases

Authors:
Peixiang Zhao;Jeffrey Xu Yu
Affiliations:
The Chinese University of Hong Kong, China;The Chinese University of Hong Kong, China
Venue:
DASFAA'07 Proceedings of the 12th international conference on Database systems for advanced applications
Year:
2007

Abstract

A free tree, a special graph that is connected, undirected, and acyclic, has been used extensively in bioinformatics, pattern recognition, computer networks, XML databases, etc. Recent research on structural pattern mining has focused on the important problem of discovering frequent free trees in large graph databases. However, this can be prohibitively expensive because a graph database may contain an exponential number of frequent free trees. In this paper, we propose a computationally efficient algorithm that discovers only closed frequent free trees in a database of labeled graphs. A free tree t is closed if no supertree of t has the same frequency as t. Two pruning algorithms, safe position pruning and safe label pruning, are proposed to efficiently detect unpromising search spaces in which no closed frequent free trees are generated. Based on the special characteristics of free trees, automorphism-based pruning and canonical mapping-based pruning are introduced to facilitate the mining process. Our performance study shows that our algorithm not only reduces the number of false positives generated but also improves mining efficiency, especially when the graph database contains large frequent free tree patterns.
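The two definitions in the abstract (free tree, closed free tree) can be illustrated with a minimal Python sketch. This is not the paper's algorithm and uses none of its pruning techniques: it is a brute-force check over an already-mined pattern set. The names `is_free_tree`, `closed_patterns`, and `is_proper_path_supertree` are hypothetical, and the toy example assumes patterns restricted to vertex-labeled paths so that the supertree test stays trivial.

```python
from collections import defaultdict


def is_free_tree(vertices, edges):
    """Return True if the undirected graph is connected and acyclic."""
    vertices = set(vertices)
    if not vertices:
        return False
    # A connected graph on n vertices is acyclic exactly when it has n - 1 edges.
    if len(edges) != len(vertices) - 1:
        return False
    adj = defaultdict(set)
    for u, v in edges:
        if u == v or u not in vertices or v not in vertices:
            return False
        adj[u].add(v)
        adj[v].add(u)
    # Depth-first search to confirm that every vertex is reachable.
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices


def closed_patterns(support, is_proper_supertree):
    """Keep each pattern t that has no proper supertree s with equal frequency.

    support: dict mapping each frequent pattern to its frequency.
    is_proper_supertree(s, t): caller-supplied predicate (an assumption here).
    """
    return [
        t for t in support
        if not any(
            support[s] == support[t] and is_proper_supertree(s, t)
            for s in support
        )
    ]


def is_proper_path_supertree(s, t):
    """Toy supertree test for label paths given as tuples.

    A path t is contained in a longer path s if t, read in either
    direction, matches a contiguous window of s.
    """
    if len(s) <= len(t):
        return False
    k = len(t)
    windows = [s[i:i + k] for i in range(len(s) - k + 1)]
    return t in windows or t[::-1] in windows


# Toy frequencies (made up for illustration, not taken from the paper).
support = {("a", "b"): 3, ("b", "c"): 4, ("a", "b", "c"): 3}
closed = closed_patterns(support, is_proper_path_supertree)
```

In this toy input, the path a-b is not closed because its supertree a-b-c has the same frequency, while b-c and a-b-c are closed. Checking only the frequent patterns is sufficient, since any supertree with the same frequency as a frequent tree is itself frequent.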