A family of dissimilarity measures between nodes generalizing both the shortest-path and the commute-time distances

  • Authors:
  • Luh Yen;Marco Saerens;Amin Mantrach;Masashi Shimbo

  • Affiliations:
  • Universite catholique de Louvain, Louvain-La-Neuve, Belgium;Universite catholique de Louvain, Louvain-La-Neuve, Belgium;Universite Libre de Bruxelles, Bruxelles, Belgium;Nara Institute of Science and Technology, Nara, Japan

  • Venue:
  • Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

This work introduces a new family of link-based dissimilarity measures between nodes of a weighted directed graph. This measure, called the randomized shortest-path (RSP) dissimilarity, depends on a parameter θ and has the interesting property of reducing, on one end, to the standard shortest-path distance when θ is large and, on the other end, to the commute-time (or resistance) distance when θ is small (near zero). Intuitively, it corresponds to the expected cost incurred by a random walker in order to reach a destination node from a starting node while maintaining a constant entropy (related to θ) spread in the graph. The parameter θ is therefore biasing gradually the simple random walk on the graph towards the shortest-path policy. By adopting a statistical physics approach and computing a sum over all the possible paths (discrete path integral), it is shown that the RSP dissimilarity from every node to a particular node of interest can be computed efficiently by solving two linear systems of n equations, where n is the number of nodes. On the other hand, the dissimilarity between every couple of nodes is obtained by inverting an n x n matrix. The proposed measure can be used for various graph mining tasks such as computing betweenness centrality, finding dense communities, etc, as shown in the experimental section.