Resource oblivious sorting on multicores

  • Authors:
  • Richard Cole; Vijaya Ramachandran

  • Affiliations:
  • Computer Science Dept., Courant Institute, NYU, New York, NY; Dept. of Computer Science, Univ. of Texas, Austin, TX

  • Venue:
  • ICALP'10: Proceedings of the 37th International Colloquium on Automata, Languages and Programming
  • Year:
  • 2010

Abstract

We present a new deterministic sorting algorithm that interleaves the partitioning of a sample sort with merging. Sequentially, it sorts n elements in O(n log n) time cache-obliviously with an optimal number of cache misses. The parallel complexity (or critical path length) of the algorithm is O(log n log log n), which improves on previous bounds for deterministic sample sort. Given a multicore computing environment with a global shared memory and p cores, each having a cache of size M organized in blocks of size B, our algorithm can be scheduled effectively on these p cores in a cache-oblivious manner. We improve on this cache-oblivious, processor-aware parallel implementation by using the Priority Work Stealing Scheduler (PWS) that we presented recently in a companion paper [12]. The PWS scheduler is both processor- and cache-oblivious (i.e., resource oblivious), and it tolerates asynchrony among the cores. Using PWS, we obtain a resource-oblivious scheduling of our sorting algorithm that matches the performance of the processor-aware version. Our analysis includes the delay incurred by false sharing. We also establish good bounds for our algorithm under the randomized work-stealing scheduler.
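For readers unfamiliar with sample sort, the partitioning idea mentioned in the abstract can be illustrated with a minimal sequential sketch. This is not the authors' algorithm: it does not interleave partitioning with merging, and it is neither parallel nor cache-oblivious. The function name `sample_sort`, the `base` cutoff, and the choice of every sqrt(n)-th input element as the sample are illustrative assumptions, and keys are assumed to be hashable and mutually comparable.

```python
from bisect import bisect_left


def sample_sort(items, base=32):
    """Sort a list with a simple deterministic sample sort (illustrative only)."""
    n = len(items)
    if n <= base:
        # Small inputs: fall back to the built-in sort.
        return sorted(items)

    # Deterministic sample: every step-th input element (an assumed choice).
    step = max(2, int(n ** 0.5))
    pivots = sorted(set(items[::step]))

    # Even bucket 2*i holds keys strictly between pivots[i-1] and pivots[i];
    # odd bucket 2*i + 1 holds keys equal to pivots[i];
    # the last (even) bucket holds keys above every pivot.
    buckets = [[] for _ in range(2 * len(pivots) + 1)]
    for x in items:
        i = bisect_left(pivots, x)
        if i < len(pivots) and pivots[i] == x:
            buckets[2 * i + 1].append(x)
        else:
            buckets[2 * i].append(x)

    # Odd buckets contain only equal keys, so they need no further sorting.
    # Every even bucket is strictly smaller than the input, because items[0]
    # is always a pivot and therefore lands in an odd bucket.
    out = []
    for j, bucket in enumerate(buckets):
        out.extend(bucket if j % 2 else sample_sort(bucket, base))
    return out
```

Separating keys equal to a pivot into their own buckets is a design choice of this sketch: it guarantees that the recursion terminates even when the input contains many duplicates.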