Piccolo: building fast, distributed programs with partitioned tables

  • Authors:
  • Russell Power;Jinyang Li

  • Affiliations:
  • New York University;New York University

  • Venue:
  • OSDI'10 Proceedings of the 9th USENIX conference on Operating systems design and implementation
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Piccolo is a new data-centric programming model for writing parallel in-memory applications in data centers. Unlike existing data-flow models, Piccolo allows computation running on different machines to share distributed, mutable state via a key-value table interface. Piccolo enables efficient application implementations. In particular, applications can specify locality policies to exploit the locality of shared state access and Piccolo's run-time automatically resolves write-write conflicts using user-defined accumulation functions. Using Piccolo, we have implemented applications for several problem domains, including the PageRank algorithm, k-means clustering and a distributed crawler. Experiments using 100 Amazon EC2 instances and a 12 machine cluster show Piccolo to be faster than existing data flow models for many problems, while providing similar fault-tolerance guarantees and a convenient programming interface.