PrIter: a distributed framework for prioritized iterative computations

  • Authors: Yanfeng Zhang; Qixin Gao; Lixin Gao; Cuirong Wang
  • Affiliations: Northeastern University, China and University of Massachusetts Amherst; Northeastern University, China; University of Massachusetts Amherst; Northeastern University, China
  • Venue: Proceedings of the 2nd ACM Symposium on Cloud Computing
  • Year: 2011

Abstract

Iterative computations are pervasive among data analysis applications in the cloud, including Web search, online social network analysis, and recommendation systems. These applications typically involve massive data sets, so fast convergence of the iterative computation is essential. In this paper, we explore the opportunity for accelerating iterative computations and propose PrIter, a distributed computing framework that enables fast iterative computation by supporting prioritized iteration. Instead of performing computations on all data records indiscriminately, PrIter prioritizes the computations that contribute most to convergence, so that the convergence speed of the iterative process is significantly improved. We evaluate PrIter on a local cluster of machines as well as on the Amazon EC2 cloud. The results show that PrIter achieves up to a 50x speedup over Hadoop for a series of iterative algorithms.
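
To make the idea of prioritized iteration concrete, the following is a minimal single-machine sketch in Python. It is not PrIter's API or the authors' implementation: the function name `prioritized_pagerank`, the `batch` and `tol` parameters, the damping factor of 0.8, and the toy graph `links` are all assumptions chosen for illustration. The sketch uses an accumulative (delta-based) form of PageRank, in which each node holds a pending change, and each round processes only the nodes whose pending change is largest. It assumes every node has at least one outgoing link.

```python
import heapq


def prioritized_pagerank(graph, damping=0.8, batch=2, tol=1e-9):
    """Delta-based PageRank that updates only the highest-priority nodes.

    graph maps each node to a non-empty list of nodes it links to.
    Returns the (unnormalized) rank values and the number of node updates.
    """
    value = {node: 0.0 for node in graph}
    # Pending change per node; it doubles as the node's priority.
    delta = {node: 1.0 - damping for node in graph}
    updates = 0

    # Stop once the total pending change is negligible.
    while sum(delta.values()) > tol:
        # Pick the `batch` nodes with the largest pending change.
        chosen = heapq.nlargest(batch, delta, key=delta.get)
        for node in chosen:
            pending = delta[node]
            delta[node] = 0.0
            value[node] += pending
            # Forward the damped change to the node's neighbors.
            share = damping * pending / len(graph[node])
            for neighbor in graph[node]:
                delta[neighbor] += share
            updates += 1
    return value, updates


# Hypothetical toy graph: every node has at least one outgoing link.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

# Prioritized: only the top 2 nodes are processed per round.
prioritized, prioritized_updates = prioritized_pagerank(links, batch=2)
# Baseline: every node is processed in every round.
baseline, baseline_updates = prioritized_pagerank(links, batch=len(links))
```

Both runs approach the same fixed point, since the order of updates does not change the values that the accumulated deltas converge to. Setting `batch` to the number of nodes corresponds to treating all records alike, while a smaller `batch` concentrates work on the records whose updates currently matter most.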