Top-k subgraph matching query in a large graph

  • Authors:
  • Lei Zou; Lei Chen; Yansheng Lu

  • Affiliations:
  • Huazhong University of Science and Technology, Wuhan, China; Hong Kong University of Science and Technology, Hong Kong, Hong Kong; Huazhong University of Science and Technology, Wuhan, China

  • Venue:
  • Proceedings of the ACM first Ph.D. workshop in CIKM
  • Year:
  • 2007

Abstract

Subgraph search has recently attracted considerable attention from the database and data mining communities because of its wide range of applications. Subgraph search is defined as follows: given a query graph Q, report all data graphs in the database that contain Q. However, little work has addressed subgraph search in a single large graph, a setting that arises in many applications, such as biological networks and social networks. In this paper, we address the top-k subgraph matching query problem, defined as follows: given a query graph Q, locate the top-k matchings of Q in a large data graph G according to a score function. The score function is defined as the sum of the pairwise similarities between each vertex in Q and its matching vertex in G. Specifically, we first design a balanced tree, the G-Tree, to index the large data graph. Then, based on the G-Tree, we propose an efficient query algorithm, the Ranked Matching algorithm. Our extensive experimental results show that, owing to the efficiency of the pruning strategy, for a query with up to 20 vertices we can locate the top-100 matchings in less than 10 seconds in a large data graph with 100K vertices. Furthermore, our approach outperforms the alternative method by orders of magnitude.
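
To make the problem statement concrete, the following is a minimal brute-force sketch of a top-k subgraph matching query. It is not the G-Tree index or the Ranked Matching algorithm, whose details the abstract does not give; it is a naive backtracking baseline. Several things here are assumptions for illustration only: the function name `top_k_matchings`, the adjacency-set representation of G, the requirement that a matching preserve every edge of Q (without requiring an induced subgraph), the convention that a higher score is better, and the integer similarity table.

```python
import heapq

def top_k_matchings(q_vertices, q_edges, g_adj, sim, k):
    """Naive baseline (hypothetical helper, not the paper's algorithm).

    Enumerates every injective mapping of Q's vertices into G that
    preserves all edges of Q, scores each mapping as the sum of
    sim(q_vertex, g_vertex), and returns the k highest-scoring ones.
    """
    # Build an adjacency map for the query graph Q.
    q_adj = {u: set() for u in q_vertices}
    for u, v in q_edges:
        q_adj[u].add(v)
        q_adj[v].add(u)

    results = []  # (score, mapping) for every complete matching

    def extend(i, mapping, used, score):
        if i == len(q_vertices):
            results.append((score, dict(mapping)))
            return
        u = q_vertices[i]
        for x in g_adj:
            if x in used:
                continue
            # Each already-mapped neighbour of u must map to a neighbour of x.
            if all(mapping[w] in g_adj[x] for w in q_adj[u] if w in mapping):
                mapping[u] = x
                used.add(x)
                extend(i + 1, mapping, used, score + sim(u, x))
                del mapping[u]
                used.remove(x)

    extend(0, {}, set(), 0)
    return heapq.nlargest(k, results, key=lambda r: r[0])


# Toy example: Q is a single edge a-b, G is the path 1-2-3-4.
q_vertices = ["a", "b"]
q_edges = [("a", "b")]
g_adj = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}

# Assumed similarity table (integers keep the arithmetic exact).
sim_table = {
    ("a", 1): 9, ("a", 2): 2, ("a", 3): 5, ("a", 4): 1,
    ("b", 1): 3, ("b", 2): 8, ("b", 3): 4, ("b", 4): 7,
}

def sim(u, x):
    return sim_table.get((u, x), 0)

top2 = top_k_matchings(q_vertices, q_edges, g_adj, sim, k=2)
for score, mapping in top2:
    print(score, mapping)
```

This baseline visits every edge-preserving mapping before ranking, so its cost grows exponentially with the size of Q. That is the cost an index combined with a pruning strategy, as described in the abstract, is meant to avoid.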