A Measure of Similarity between Graph Vertices: Applications to Synonym Extraction and Web Searching

  • Authors:
  • Vincent D. Blondel;Anahí Gajardo;Maureen Heymans;Pierre Senellart;Paul Van Dooren

  • Affiliations:
  • -;-;-;-;-

  • Venue:
  • SIAM Review
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

We introduce a concept of {similarity} between vertices of directed graphs. Let GA and GB be two directed graphs with, respectively, nA and nB vertices. We define an nB \times nA similarity matrix S whose real entry sij expresses how similar vertex j (in GA) is to vertex i (in GB): we say that sij is their similarity score. The similarity matrix can be obtained as the limit of the normalized even iterates of Sk+1 = BSkAT + BTSkA, where A and B are adjacency matrices of the graphs and S0 is a matrix whose entries are all equal to 1. In the special case where GA = GB = G, the matrix S is square and the score sij is the similarity score between the vertices i and j of G. We point out that Kleinberg's "hub and authority" method to identify web-pages relevant to a given query can be viewed as a special case of our definition in the case where one of the graphs has two vertices and a unique directed edge between them. In analogy to Kleinberg, we show that our similarity scores are given by the components of a dominant eigenvector of a nonnegative matrix. Potential applications of our similarity concept are numerous. We illustrate an application for the automatic extraction of synonyms in a monolingual dictionary.