A General Framework for Mining Frequent Subgraphs from Labeled Graphs

Authors:
Akihiro Inokuchi;Takashi Washio;Hiroshi Motoda
Affiliations:
Tokyo Research Laboratory, IBM Japan, 1623-14 Shimotsuruma, Yamato, Kanagawa 242-8502, Japan. inokuchi@jp.ibm.com;The Institute of Scientific and Industrial Research, Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka 567-0047, Japan. washio@ar.sanken.osaka-u.ac.jp;The Institute of Scientific and Industrial Research, Osaka University, 8-1 Mihogaoka, Ibaraki, Osaka 567-0047, Japan. motoda@ar.sanken.osaka-u.ac.jp
Venue:
Fundamenta Informaticae - Advances in Mining Graphs, Trees and Sequences
Year:
2005

Abstract

Deriving frequent subgraphs from a dataset of labeled graphs has high computational complexity, because the hard problems of graph isomorphism and subgraph isomorphism must be solved as part of the derivation. To cope with this complexity, all previous approaches have focused on one particular kind of graph. In this paper, we propose an approach that conducts a complete search for various classes of frequent subgraphs in a massive dataset of labeled graphs within practical time. The power of our approach comes from an algebraic representation of graphs, its associated operations, and well-organized bias constraints that limit the search space efficiently. The performance has been evaluated on real-world datasets, and the high scalability and flexibility of our approach have been confirmed with respect to the amount of data and the computation time.
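
To make the notion of a frequent subgraph concrete, the following is a minimal sketch of support counting over a small dataset of labeled graphs. It is not the paper's algorithm: it uses a brute-force subgraph isomorphism test rather than the algebraic representation described in the abstract, and it matches general (non-induced) subgraphs. The names `make_graph`, `contains`, and `support`, the toy dataset, and the assumption of undirected graphs without self-loops are all illustrative choices, not taken from the paper.

```python
from itertools import permutations

def make_graph(vertex_labels, edge_list):
    """Build an undirected labeled graph (hypothetical representation).

    vertex_labels: list of labels; the list index is the vertex id.
    edge_list: iterable of (u, v, edge_label) with u != v.
    """
    edges = {(min(u, v), max(u, v)): lab for u, v, lab in edge_list}
    return list(vertex_labels), edges

def contains(graph, pattern):
    """Return True if `pattern` occurs in `graph` as a (non-induced) subgraph.

    Brute force: try every injective assignment of pattern vertices to
    graph vertices and check that vertex and edge labels are preserved.
    The cost grows factorially, which is why this step is the bottleneck.
    """
    g_labels, g_edges = graph
    p_labels, p_edges = pattern
    k = len(p_labels)
    for image in permutations(range(len(g_labels)), k):
        # Vertex labels must agree under the candidate mapping.
        if any(g_labels[image[i]] != p_labels[i] for i in range(k)):
            continue
        # Every pattern edge must exist in the graph with the same label.
        matched = True
        for (u, v), lab in p_edges.items():
            key = (min(image[u], image[v]), max(image[u], image[v]))
            if key not in g_edges or g_edges[key] != lab:
                matched = False
                break
        if matched:
            return True
    return False

def support(dataset, pattern):
    """Fraction of graphs in `dataset` that contain `pattern`."""
    return sum(contains(g, pattern) for g in dataset) / len(dataset)

# Toy dataset of three molecule-like graphs (illustrative only).
dataset = [
    make_graph(["C", "C", "O"], [(0, 1, "single"), (1, 2, "double")]),
    make_graph(["C", "O"], [(0, 1, "double")]),
    make_graph(["C", "C", "N"], [(0, 1, "single"), (1, 2, "single")]),
]

# Candidate pattern: a carbon joined to an oxygen by a double bond.
c_double_o = make_graph(["C", "O"], [(0, 1, "double")])

# The pattern is frequent if its support reaches a chosen minimum support.
min_support = 0.5
is_frequent = support(dataset, c_double_o) >= min_support
```

Here the pattern occurs in the first two graphs but not the third, so it counts as frequent at a minimum support of 0.5. A complete miner would have to repeat this containment test for every candidate pattern, which is the cost the approach in the paper aims to keep practical.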