Query-driven discovery of semantically similar substructures in heterogeneous networks

  • Authors:
  • Xiao Yu; Yizhou Sun; Peixiang Zhao; Jiawei Han

  • Affiliations:
  • University of Illinois at Urbana-Champaign, Urbana, IL, USA (all authors)

  • Venue:
  • Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2012

Abstract

Heterogeneous information networks, which contain multiple types of objects and links, are ubiquitous in the real world; examples include bibliographic networks, cyber-physical networks, and social media networks. Although researchers have studied various data mining tasks in information networks, interactive query-based network exploration techniques, which are highly desirable for exploring large-scale information networks, have not been addressed systematically. In this demo, we introduce and demonstrate our recent research project on query-driven discovery of semantically similar substructures in heterogeneous networks. Given a subgraph query, our system searches a large information network and efficiently finds a list of subgraphs that are structurally identical and semantically similar to the query. Because data mining methods are used to obtain semantically similar entities (nodes), we use the term discovery to describe this process. To achieve high efficiency and scalability, we design and implement a filter-and-verification search framework, which first generates promising subgraph candidates using offline indices built from data mining results and then verifies the candidates with a recursive matching process that prunes infeasible partial matches. The proposed system demonstrates the effectiveness of our query-driven semantic similarity search framework and the efficiency of the proposed methodology on multiple real-world heterogeneous information networks.
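
The filter-and-verification idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy bibliographic network, the hand-written `SIMILAR` index (standing in for an offline index that would be built from data mining results), and the function names `filter_candidates`, `verify`, and `find_matches` are all assumptions made for illustration. The filter step looks up same-typed, semantically similar candidates for each query node; the verification step extends a partial match recursively and prunes any assignment that breaks a query edge.

```python
from typing import Dict, List, Set, Tuple

# Toy heterogeneous network (hypothetical data): node -> type, undirected edges.
NODE_TYPE: Dict[str, str] = {
    "alice": "author", "bob": "author", "carol": "author",
    "p1": "paper", "p2": "paper", "p3": "paper",
    "kdd": "venue", "icde": "venue",
}
EDGES: Set[frozenset] = {
    frozenset(e) for e in [
        ("alice", "p1"), ("p1", "kdd"),
        ("bob", "p2"), ("p2", "kdd"),
        ("carol", "p3"), ("p3", "icde"),
    ]
}

# Assumed offline index: node -> semantically similar nodes.
# In a real system this would come from a mining step; here it is hand-written.
SIMILAR: Dict[str, Set[str]] = {
    "alice": {"alice", "bob", "carol"},
    "p1": {"p1", "p2", "p3"},
    "kdd": {"kdd"},
}


def filter_candidates(query_nodes: List[str]) -> Dict[str, Set[str]]:
    """Filter step: same-typed similar nodes for each query node."""
    candidates = {}
    for q in query_nodes:
        similar = SIMILAR.get(q, {q})
        candidates[q] = {c for c in similar if NODE_TYPE[c] == NODE_TYPE[q]}
    return candidates


def verify(query_nodes: List[str],
           query_edges: List[Tuple[str, str]],
           candidates: Dict[str, Set[str]]) -> List[Dict[str, str]]:
    """Verification step: recursive matching with edge-based pruning."""
    matches: List[Dict[str, str]] = []

    def consistent(q: str, c: str, mapping: Dict[str, str]) -> bool:
        # Every query edge from q to an already-mapped node must exist in the data.
        for a, b in query_edges:
            other = b if a == q else a if b == q else None
            if other is not None and other in mapping:
                if frozenset((c, mapping[other])) not in EDGES:
                    return False
        return True

    def extend(i: int, mapping: Dict[str, str]) -> None:
        if i == len(query_nodes):
            matches.append(dict(mapping))
            return
        q = query_nodes[i]
        for c in sorted(candidates[q]):
            if c in mapping.values():
                continue  # keep the match injective
            if consistent(q, c, mapping):  # prune infeasible partial matches
                mapping[q] = c
                extend(i + 1, mapping)
                del mapping[q]

    extend(0, {})
    return matches


def find_matches(query_nodes: List[str],
                 query_edges: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    return verify(query_nodes, query_edges, filter_candidates(query_nodes))


if __name__ == "__main__":
    for m in find_matches(["alice", "p1", "kdd"],
                          [("alice", "p1"), ("p1", "kdd")]):
        print(m)
```

For the author-paper-venue query above, the sketch returns the query itself and the structurally identical subgraph bob, p2, kdd. The candidate carol, p3 passes the filter but is pruned during verification because p3 is not linked to kdd.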