A web-site-based partitioning technique for reducing preprocessing overhead of parallel PageRank computation

  • Authors: Ali Cevahir; Cevdet Aykanat; Ata Turk; B. Barla Cambazoglu
  • Affiliations: Bilkent University, Department of Computer Engineering, Bilkent, Ankara, Turkey (all authors)
  • Venue: PARA'06 Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing
  • Year: 2006

Abstract

A power method formulation, which efficiently handles the problem of dangling pages, is investigated for parallelization of PageRank computation. Hypergraph-partitioning-based sparse matrix partitioning methods can be successfully used for efficient parallelization. However, the preprocessing overhead due to hypergraph partitioning, which must be repeated often due to the evolving nature of the Web, is quite significant compared to the duration of the PageRank computation. To alleviate this problem, we utilize the information that sites form a natural clustering on pages to propose a site-based hypergraph-partitioning technique, which does not degrade the quality of the parallelization. We also propose an efficient parallelization scheme for matrix-vector multiplies in order to avoid possible communication due to the pages without in-links. Experimental results on realistic datasets validate the effectiveness of the proposed models.
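To make the dangling-page issue in the abstract concrete, the following is a minimal serial sketch of a PageRank power iteration in which the rank mass held by pages without out-links is redistributed uniformly at every step. It is not the parallel, hypergraph-partitioned scheme proposed in the paper; the function name `pagerank`, the `out_links` dictionary layout, the damping factor of 0.85, and the uniform redistribution rule are all assumptions made for illustration.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=200):
    """Serial power iteration (illustrative sketch, not the paper's method).

    out_links maps each page to the list of pages it links to.
    Pages with an empty list (or no entry) are treated as dangling.
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(out_links) | {q for ts in out_links.values() for q in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(max_iter):
        # Total rank currently held by dangling pages (no out-links).
        dangling = sum(rank[p] for p in pages if not out_links.get(p))

        # Every page receives the teleport share plus an equal share of
        # the dangling mass (assumed uniform redistribution).
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}

        # Non-dangling pages split their damped rank among their targets.
        for p, targets in out_links.items():
            if targets:
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share

        # L1 change between successive iterates is the stopping criterion.
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical four-page graph: "d" is dangling and has no in-links.
example_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []}
example_ranks = pagerank(example_graph)
```

Handling the dangling mass as a single scalar per iteration avoids materializing dense rows in the transition matrix, which is one common way to keep the matrix-vector multiply sparse. In this sketch a page without in-links, such as `d`, only ever receives the `base` term.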