Relevance search in heterogeneous networks

  • Authors:
  • Chuan Shi;Xiangnan Kong;Philip S. Yu;Sihong Xie;Bin Wu

  • Affiliations:
  • Beijing University of Posts and Telecommunications, Beijing, China;University of Illinois at Chicago, IL;University of Illinois at Chicago, IL, King Abdulaziz University, Jeddah, Saudi Arabia;University of Illinois at Chicago, IL;Beijing University of Posts and Telecommunications, Beijing, China

  • Venue:
  • Proceedings of the 15th International Conference on Extending Database Technology
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Conventional research on similarity search focuses on measuring the similarity between objects with the same type. However, in many real-world applications, we need to measure the relatedness between objects with different types. For example, in automatic expert profiling, people are interested in finding the most relevant objects to an expert, where the objects can be of various types, such as research areas, conferences and papers, etc. With the surge of study on heterogeneous networks, the relatedness measure on objects with different types becomes increasingly important. In this paper, we study the relevance search problem in heterogeneous networks, where the task is to measure the relatedness of heterogeneous objects (including objects with the same type or different types). We propose a novel measure, called HeteSim, with the following attributes: (1) a path-constrained measure: the relatedness of object pairs are defined based on the search path that connect two objects through following a sequence of node types; (2) a uniform measure: it can measure the relatedness of objects with the same or different types in a uniform framework; (3) a semi-metric measure: HeteSim has some good properties (e.g., self-maximum and symmetric), that are crucial to many tasks. Empirical studies show that HeteSim can effectively evaluate the relatedness of heterogeneous objects. Moreover, in the query and clustering tasks, it can achieve better performances than conventional measures.