Low depth cache-oblivious algorithms

  • Authors:
  • Guy E. Blelloch;Phillip B. Gibbons;Harsha Vardhan Simhadri

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, USA;Intel Research Pittsburgh, Pittsburgh, USA;Carnegie Mellon University, Pittsuburgh, USA

  • Venue:
  • Proceedings of the twenty-second annual ACM symposium on Parallelism in algorithms and architectures
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper we explore a simple and general approach for developing parallel algorithms that lead to good cache complexity on parallel machines with private or shared caches. The approach is to design nested-parallel algorithms that have low depth (span, critical path length) and for which the natural sequential evaluation order has low cache complexity in the cache-oblivious model. We describe several cache-oblivious algorithms with optimal work, polylogarithmic depth, and sequential cache complexities that match the best sequential algorithms, including the first such algorithms for sorting and for sparse-matrix vector multiply on matrices with good vertex separators. Using known mappings, our results lead to low cache complexities on shared-memory multiprocessors with a single level of private caches or a single shared cache. We generalize these mappings to multi-level cache hierarchies of private or shared caches, implying that our algorithms also have low cache complexities on such hierarchies. The key factor in obtaining these low parallel cache complexities is the low depth of the algorithms we propose.