Querying similar graphs in graph databases has been widely studied in graph query processing in recent years. Existing work mainly focuses on subgraph similarity search and supergraph similarity search. In this paper, we study the problem of finding the top-k graphs in a graph database that are most similar to a query graph. This problem has many applications, such as image retrieval and chemical compound structure search. Feature-based and kernel-based similarity measures have been used in the literature, but such measures are coarse and may lose the connectivity information among substructures. We introduce a new similarity measure based on the maximum common subgraph (MCS) of two graphs and show that it better captures the common and differing structures of two graphs. Since computing the MCS of two graphs is NP-hard, we propose an algorithm that answers the top-k graph similarity query using two distance lower bounds with different computational costs, in order to reduce the number of MCS computations. We further introduce an indexing technique that exploits the triangle property of similarities among graphs in the database to obtain tighter lower bounds. Three indexing methods are proposed, offering different tradeoffs between pruning power and construction cost. We conducted extensive performance studies on large real datasets to evaluate our approaches.
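The filter-and-verify idea behind the top-k query can be illustrated with a minimal sketch. It is not the paper's algorithm: it uses a single cheap lower bound rather than the two bounds and the index described above, and it makes several simplifying assumptions. Graphs are modeled as undirected edge sets over uniquely labeled vertices, so the MCS reduces to an edge-set intersection (the general problem is NP-hard). The distance `1 - |mcs| / max(|g1|, |g2|)` is one common normalization, assumed here because the abstract does not state the exact formula. All function names (`edges`, `mcs_size`, `distance`, `size_lower_bound`, `top_k`) are hypothetical.

```python
import heapq
from typing import FrozenSet, List, Tuple

# Assumption: a graph is an undirected edge set over uniquely labeled vertices.
Graph = FrozenSet[Tuple[str, str]]


def edges(*pairs: Tuple[str, str]) -> Graph:
    """Build a graph, normalizing each undirected edge to a sorted pair."""
    return frozenset((min(u, v), max(u, v)) for u, v in pairs)


def mcs_size(g1: Graph, g2: Graph) -> int:
    """With unique vertex labels, the MCS is just the shared edge set."""
    return len(g1 & g2)


def distance(g1: Graph, g2: Graph) -> float:
    """Assumed MCS-based distance: 1 - |mcs| / max(|g1|, |g2|)."""
    larger = max(len(g1), len(g2))
    return 0.0 if larger == 0 else 1.0 - mcs_size(g1, g2) / larger


def size_lower_bound(g1: Graph, g2: Graph) -> float:
    """Cheap bound: the MCS can never exceed the smaller graph."""
    larger = max(len(g1), len(g2))
    return 0.0 if larger == 0 else 1.0 - min(len(g1), len(g2)) / larger


def top_k(query: Graph, database: List[Graph], k: int):
    """Return ([(distance, index), ...], number_of_exact_computations)."""
    # Visit candidates in ascending order of their lower bound.
    candidates = sorted(
        (size_lower_bound(query, g), i) for i, g in enumerate(database)
    )
    heap: List[Tuple[float, int]] = []  # max-heap via negated distances
    exact_calls = 0
    for lower, i in candidates:
        # Once k results are held and the bound cannot beat the worst of
        # them, no remaining candidate can either, so stop early.
        if len(heap) == k and lower >= -heap[0][0]:
            break
        d = distance(query, database[i])
        exact_calls += 1
        if len(heap) < k:
            heapq.heappush(heap, (-d, i))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, i))
    return sorted((-neg_d, i) for neg_d, i in heap), exact_calls
```

Because candidates are visited in ascending order of the lower bound, the loop can stop as soon as the bound reaches the current k-th best exact distance; every graph after that point is pruned without an MCS computation. A tighter but costlier second bound, or a bound derived from precomputed distances to indexed graphs, would slot in between the cheap filter and the exact call.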