Clustering graphs by weighted substructure mining

Authors:
Koji Tsuda;Taku Kudo
Affiliations:
Max Planck Institute for Biological Cybernetics, Tübingen, Germany;Google Japan Inc., Sakuragaoka-cho, Shibuya-ku, Tokyo, Japan
Venue:
ICML '06 Proceedings of the 23rd international conference on Machine learning
Year:
2006

Citing 4

gSpan: Graph-Based Substructure Pattern Mining
ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining

CloseGraph: mining closed frequent graph patterns
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining

Mining Generalized Substructures from a Set of Labeled Graphs
ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining

Pattern Vectors from Algebraic Graph Theory
IEEE Transactions on Pattern Analysis and Machine Intelligence

Cited 12

Learning from interpretations: a rooted kernel for ordered hypergraphs
Proceedings of the 24th international conference on Machine learning

Partial least squares regression for graph mining
Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining

Graph self-organizing maps for cyclic and unbounded graphs
Neurocomputing

Maximum margin clustering made practical
IEEE Transactions on Neural Networks

Data clustering: 50 years beyond K-means
Pattern Recognition Letters

Time and space efficient discovery of maximal geometric graphs
DS'07 Proceedings of the 10th international conference on Discovery science

Efficient algorithms for mining frequent and closed patterns from semi-structured data
PAKDD'08 Proceedings of the 12th Pacific-Asia conference on Advances in knowledge discovery and data mining

Online structural graph clustering using frequent subgraph mining
ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part III

Parallel structural graph clustering
ECML PKDD'11 Proceedings of the 2011 European conference on Machine learning and knowledge discovery in databases - Volume Part III

Multi-agent adaptive boosting on semi-supervised water supply clusters
Advances in Engineering Software

Substructure clustering: a novel mining paradigm for arbitrary data types
SSDBM'12 Proceedings of the 24th international conference on Scientific and Statistical Database Management

Predicting the social influence of upcoming contents in large social networks
Proceedings of the Fifth International Conference on Internet Multimedia Computing and Service

Abstract

Graph data is becoming increasingly common in fields such as bioinformatics and text processing. A main difficulty in processing graph data lies in the intrinsic high dimensionality of graphs: when a graph is represented as a binary feature vector of indicators of all possible subgraphs, the dimensionality becomes too large for usual statistical methods. We propose an efficient method for learning a binomial mixture model in this feature space. By combining the l1 regularizer with a data structure called the DFS code tree, the MAP estimate of the non-zero parameters is computed efficiently by means of the EM algorithm. Our method is applied to the clustering of RNA graphs and compares favorably with graph kernels and the spectral graph distance.
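
To make the mixture-model idea in the abstract concrete, the following is a minimal sketch of plain EM for a mixture of independent Bernoulli (binomial) features over binary indicator vectors. It is not the authors' algorithm: it assumes the subgraph indicators have already been enumerated into a small explicit matrix, which is exactly what the paper avoids, and it omits both the l1 regularizer and the DFS code tree. The function name `fit_bernoulli_mixture` and all parameter names are hypothetical and chosen only for illustration.

```python
import math
import random


def fit_bernoulli_mixture(X, n_clusters, n_iter=50, seed=0, eps=1e-6):
    """Plain EM for a mixture of independent Bernoulli features.

    X is a list of equal-length 0/1 lists: one row per graph, one column per
    subgraph indicator (assumed here to be enumerated explicitly).
    Returns (weights, probs, resp), where resp[i][k] is the posterior
    probability that row i belongs to cluster k.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    weights = [1.0 / n_clusters] * n_clusters
    # Random initial feature probabilities, kept away from 0 and 1.
    probs = [[rng.uniform(0.25, 0.75) for _ in range(d)]
             for _ in range(n_clusters)]
    resp = [[0.0] * n_clusters for _ in range(n)]

    for _ in range(n_iter):
        # E-step: cluster responsibilities, computed in log space for stability.
        for i, x in enumerate(X):
            logp = []
            for k in range(n_clusters):
                s = math.log(weights[k])
                for j in range(d):
                    p = probs[k][j]
                    s += math.log(p) if x[j] else math.log(1.0 - p)
                logp.append(s)
            m = max(logp)
            unnorm = [math.exp(v - m) for v in logp]
            z = sum(unnorm)
            resp[i] = [u / z for u in unnorm]

        # M-step: re-estimate mixing weights and per-feature probabilities.
        for k in range(n_clusters):
            nk = sum(resp[i][k] for i in range(n))
            weights[k] = max(nk / n, eps)
            for j in range(d):
                pj = sum(resp[i][k] * X[i][j] for i in range(n)) / max(nk, eps)
                # Clip so that log(p) and log(1 - p) stay finite.
                probs[k][j] = min(max(pj, eps), 1.0 - eps)
        total = sum(weights)
        weights = [w / total for w in weights]

    return weights, probs, resp
```

Calling `fit_bernoulli_mixture(X, 2)` on a small 0/1 matrix and taking the largest entry of each row of `resp` gives a hard cluster assignment. In the setting described in the abstract the feature space is far too large to enumerate this way, which is the motivation for restricting attention to the non-zero parameters.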