Parallel PageRank computation using GPUs

  • Authors:
  • Nhat Tan Duong;Quang Anh Pham Nguyen;Anh Tu Nguyen;Huu-Duc Nguyen

  • Affiliations:
  • Hanoi University of Science and Technology, Hanoi, Vietnam;Naiscorp Information Technology Service Company, Hanoi, Vietnam;Hanoi University of Science and Technology, Hanoi, Vietnam;Hanoi University of Science and Technology, Hanoi, Vietnam

  • Venue:
  • Proceedings of the Third Symposium on Information and Communication Technology
  • Year:
  • 2012

Abstract

Fast and efficient computation of web rank scores is essential for today's search engines. Because of the enormous volume of data and the dynamic nature of the World Wide Web, this computation runs on large web graphs (up to billions of web pages) and must be refreshed frequently, which makes it a challenging task. In this paper, we propose an efficient method for computing PageRank scores on graphics processing units (GPUs); PageRank is Google's ranking method, based on analyzing the link structure of the Web. To store the web graph data, we use a slightly modified version of the binary 'link structure file' storage format, inspired by [2]. We then divide the PageRank calculation phases into parallel operations to exploit the computing power of the graphics cards. Our program was written in CUDA and tested on a system equipped with two dual-GPU NVIDIA GeForce GTX 295 graphics cards, using two real datasets crawled from Vietnamese sites: one with 7 million pages and 132 million links, the other with 15 million pages and 200 million links. The experimental results show that, on both datasets, the computation runs 10 to 20 times faster than a CPU-based version on an Intel Q8400 at 2.67 GHz. Our method also scales well to larger web graphs.
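
To illustrate why PageRank splits naturally into parallel operations, the following is a minimal sequential sketch, not the authors' implementation. It assumes a compressed in-link layout (the hypothetical arrays `row_start`, `in_links`, and `out_degree`) and a damping factor of 0.85; the abstract specifies neither the actual file format nor the parameters. Each page's new score depends only on the previous iteration's scores, so the inner per-page loop is the part that could plausibly map to one GPU thread per page.

```python
# Sketch only: pull-based PageRank over a compressed in-link layout.
# Array names and the damping value 0.85 are assumptions, not the paper's.

def pagerank(n, row_start, in_links, out_degree, damping=0.85, iters=50):
    """in_links[row_start[v]:row_start[v + 1]] lists the pages linking to v."""
    rank = [1.0 / n] * n
    for _ in range(iters):
        # Rank held by pages without out-links is spread uniformly.
        dangling = sum(rank[u] for u in range(n) if out_degree[u] == 0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = [0.0] * n
        # Every v reads only the old ranks, so iterations of this loop are
        # independent of each other (one GPU thread per page, in principle).
        for v in range(n):
            acc = 0.0
            for k in range(row_start[v], row_start[v + 1]):
                u = in_links[k]
                acc += rank[u] / out_degree[u]
            new_rank[v] = base + damping * acc
        rank = new_rank
    return rank


# Tiny example graph with links 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
row_start = [0, 1, 2, 4]
in_links = [2, 0, 0, 1]
out_degree = [2, 1, 1]
ranks = pagerank(3, row_start, in_links, out_degree)
```

In this toy graph, page 2 receives links from both other pages and ends up with the highest score, while the scores always sum to 1. Storing in-links contiguously per page keeps each page's reads sequential, which is one reason a compact link-structure layout suits this kind of computation.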