Proceedings of the VLDB Endowment
Graphs are being increasingly used to model a wide range of scientific data. Such widespread usage of graphs has generated considerable interest in mining patterns from graph databases. While an array of techniques exists to mine frequent patterns, we still lack a scalable approach to mine statistically significant patterns, specifically patterns with low p-values, that occur at low frequencies. We propose a highly scalable technique, called GraphSig, to mine significant subgraphs from large graph databases. We convert each graph into a set of feature vectors where each vector represents a region within the graph. Domain knowledge is used to select a meaningful feature set. Prior probabilities of features are computed empirically to evaluate statistical significance of patterns in the feature space. Following analysis in the feature space, only a small portion of the exponential search space is accessed for further analysis. This enables the use of existing frequent subgraph mining techniques to mine significant patterns in a scalable manner even when they are infrequent. Extensive experiments are carried out on the proposed techniques, and empirical results demonstrate that GraphSig is effective and efficient for mining significant patterns. To further demonstrate the power of significant patterns, we develop a classifier using patterns mined by GraphSig. Experimental results show that the proposed classifier achieves superior performance, both in terms of quality and computation cost, over state-of-the-art classifiers.
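The significance test in feature space can be illustrated with a small sketch. It is not the GraphSig implementation: the abstract does not specify the statistical model, so the code assumes that features are independent and that the support of a pattern follows a binomial distribution. The function names (`empirical_prior`, `binomial_tail`, `pattern_p_value`) and the dictionary representation of feature vectors are hypothetical, chosen only for illustration.

```python
from math import comb


def empirical_prior(vectors, feature, threshold):
    """Fraction of vectors whose value for `feature` is at least `threshold`."""
    hits = sum(1 for v in vectors if v.get(feature, 0) >= threshold)
    return hits / len(vectors)


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def pattern_p_value(vectors, pattern, support):
    """p-value of seeing `pattern` in at least `support` of the vectors.

    `pattern` maps each feature to a minimum value. Assumption: features
    are independent, so the chance that a random vector matches the
    pattern is the product of the per-feature empirical priors.
    """
    match_prob = 1.0
    for feature, threshold in pattern.items():
        match_prob *= empirical_prior(vectors, feature, threshold)
    return binomial_tail(len(vectors), support, match_prob)


# Toy feature vectors, one per graph region (hypothetical data).
vectors = [{"a": 1}, {"a": 0}, {"a": 2}, {"a": 0}]
p_value = pattern_p_value(vectors, {"a": 1}, support=4)
```

Under these assumptions, a pattern whose observed support is far above what the priors predict receives a low p-value even if its absolute frequency is small, which is the kind of pattern the abstract describes as significant but infrequent.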