Traps and Pitfalls of Topic-Biased PageRank

  • Authors:
  • Paolo Boldi; Roberto Posenato; Massimo Santini; Sebastiano Vigna

  • Affiliations:
  • Dipartimento di Scienze dell'Informazione, Università degli Studi di Milano, Italy; Dipartimento di Informatica, Università degli Studi di Verona, Italy; Dipartimento di Scienze dell'Informazione, Università degli Studi di Milano, Italy; Dipartimento di Scienze dell'Informazione, Università degli Studi di Milano, Italy

  • Venue:
  • Algorithms and Models for the Web-Graph
  • Year:
  • 2007

Abstract

We discuss a number of issues in the definition, computation and comparison of PageRank values that have been addressed sparsely in the literature, often with contradictory approaches. We study the difference between weakly and strongly preferential PageRank, which patch the dangling nodes with different distributions, extending analytical formulae known for the strongly preferential case, and corroborating our results with experiments on a snapshot of 100 million pages of the .uk domain. The experiments show that the two PageRank versions are poorly correlated, and results about each one cannot be blindly applied to the other; moreover, our computations highlight some new concerns about the usage of exchange-based correlation indices (such as Kendall's τ) on approximated rankings.
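
The distinction drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the usual reading of the two terms: the strongly preferential variant redistributes the rank held by dangling nodes according to the preference vector itself, while the weakly preferential variant uses another distribution, here the uniform one. The four-node graph, the damping factor 0.85, the iteration count, and the names pagerank, kendall_tau, succ, v and u are all hypothetical choices made for illustration. The Kendall's τ shown is the simple quadratic-time version, which counts tied pairs as neither concordant nor discordant.

```python
def pagerank(succ, v, u, alpha=0.85, iters=200):
    """Power iteration for PageRank with preference vector v.

    succ[i] lists the successors of node i; a node with no successors
    is dangling, and its rank is redistributed according to u.
    """
    n = len(succ)
    x = list(v)
    for _ in range(iters):
        # Teleportation always follows the preference vector v.
        nxt = [(1.0 - alpha) * v[j] for j in range(n)]
        dangling_mass = 0.0
        for i, out in enumerate(succ):
            if out:
                share = alpha * x[i] / len(out)
                for j in out:
                    nxt[j] += share
            else:
                dangling_mass += x[i]
        # Patch the dangling nodes with the distribution u.
        for j in range(n):
            nxt[j] += alpha * dangling_mass * u[j]
        x = nxt
    return x


def kendall_tau(a, b):
    """Kendall's tau over all pairs; tied pairs contribute zero."""
    n = len(a)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (a[i] - a[j]) * (b[i] - b[j])
            if prod > 0:
                score += 1
            elif prod < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


# Hypothetical toy graph: node 3 has no outlinks, so it is dangling.
succ = [[1, 2], [2], [0, 3], []]
v = [0.7, 0.1, 0.1, 0.1]              # topic-biased preference vector
uniform = [1.0 / len(succ)] * len(succ)

strong = pagerank(succ, v, u=v)        # strongly preferential: u = v
weak = pagerank(succ, v, u=uniform)    # weakly preferential: u uniform
tau = kendall_tau(strong, weak)
print(strong, weak, tau)
```

On a graph this small the two score vectors differ only modestly; the poor correlation reported in the abstract concerns the 100-million-page .uk snapshot, which this toy example does not attempt to reproduce.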