Balanced, locality-based parallel irregular reductions

  • Authors:
  • Eladio Gutiérrez;Oscar Plata;Emilio L. Zapata

  • Affiliations:
  • Department of Computer Architecture, University of Málaga, Málaga, Spain;Department of Computer Architecture, University of Málaga, Málaga, Spain;Department of Computer Architecture, University of Málaga, Málaga, Spain

  • Venue:
  • LCPC'01 Proceedings of the 14th international conference on Languages and compilers for parallel computing
  • Year:
  • 2001

Quantified Score

Hi-index 0.00

Visualization

Abstract

Much effort has been devoted recently to efficiently parallelize irregular reductions. Different parallelization techniques have been proposed during the last years that can be classified into two groups: LPO (Loop Partitioning Oriented methods) and DPO (Data Partitioning Oriented methods). We have analyzed both classes in terms of a set of performance aspects: data locality, memory overhead, parallelism and workload balancing. Load balancing is not an issue sufficiently analyzed in the literature in parallel reduction methods, specially those in the DPO class. In this paper we propose two techniques to introduce load balancing into a DPO method. The first technique is generic, as it can deal with any kind of load unbalancing present in the problem domain. The second technique handles a special case of load unbalancing, appearing when there are a large number of write operations on small regions of the reduction arrays. Efficient implementations of the proposed solutions to load balancing for an example DPO method are presented. Experiments on static and dynamic kernel codes were conducted making comparisons with other parallel reduction methods.