Impact of Reverse Computing on Information Locality in Register Allocation for High Performance Computing

  • Authors:
  • Mouad Bahi; Christine Eisenbeis

  • Affiliations:
  • INRIA Saclay Île-de-France, Orsay, France and LRI, Université de Paris-Sud 11, Orsay, France and LERMA, Observatoire de Paris, Paris, France; INRIA Saclay Île-de-France, Orsay, France and LRI, Université de Paris-Sud 11, Orsay, France

  • Venue:
  • International Journal of Parallel Programming
  • Year:
  • 2014

Abstract

Reversible computing aims to keep all information about input and intermediate values available at every step of a computation, making information virtually present everywhere. Rematerialization in register allocation amounts to recomputing values instead of spilling them to memory when registers run out. In this paper we detail a heuristic algorithm that exploits reverse computing for register rematerialization. This improves information locality, since it provides more opportunities for retrieving data. Rematerialization adds instructions, and we show on one specifically designed example that reverse computing may alleviate the impact of these additional instructions on performance. We also show how thread parallelism may be optimized on GPUs by performing register allocation with reverse recomputing, which increases the number of threads per Streaming Multiprocessor. This is done on the main kernel of a Lattice Quantum Chromodynamics simulation program, where we gain an 11% speedup.
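
To make the distinction between spilling, forward rematerialization, and reverse recomputation concrete, here is a minimal Python sketch. It is not the heuristic from the paper. The function `run`, the strategy names, and the three-instruction sequence are hypothetical, chosen only to illustrate the idea: a value `a` is overwritten by a reversible update and is then needed again.

```python
def run(strategy, x, y, b):
    """Recover an overwritten value `a` using one of three strategies.

    Hypothetical toy model: `memory` stands in for the spill area and
    `mem_ops` counts the loads and stores a spill would cost.
    Integers are used so that the inverse operation is exact.
    """
    memory = {}
    mem_ops = 0

    a = x * y                  # value that will be needed again later
    if strategy == "spill":
        memory["a"] = a        # store before the register is reused
        mem_ops += 1

    a2 = a + b                 # reversible update; assume it reuses a's register
    del a                      # the old value is no longer in a register

    if strategy == "spill":
        a_again = memory["a"]  # reload from memory
        mem_ops += 1
    elif strategy == "recompute":
        a_again = x * y        # forward rematerialization: x and y must be live
    elif strategy == "reverse":
        a_again = a2 - b       # reverse computing: only a2 and b must be live
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    return a_again, mem_ops
```

All three strategies return the same value. In this toy model the reverse strategy needs neither memory traffic nor the original operands `x` and `y`, which is the sense in which reverse computing offers more opportunities for retrieving data. With floating-point arithmetic the subtraction would not in general invert the addition exactly, so the sketch sticks to integers.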