Reducing data access latency in SDSM systems using runtime optimizations

  • Authors:
  • Javier Bueno;Xavier Martorell;Juan José Costa;Toni Cortés;Eduard Ayguadé;Guansong Zhang;Christopher Barton;Raul Silvera

  • Affiliations:
  • Universitat Politècnica de Catalunya, Barcelona, Spain;Universitat Politècnica de Catalunya, Barcelona, Spain;Universitat Politècnica de Catalunya, Barcelona, Spain;Universitat Politècnica de Catalunya, Barcelona, Spain;Universitat Politècnica de Catalunya, Barcelona, Spain;IBM Toronto Lab, Markham, ON, Canada;IBM Toronto Lab, Markham, ON, Canada;IBM Toronto Lab, Markham, ON, Canada

  • Venue:
  • Proceedings of the 2010 Conference of the Center for Advanced Studies on Collaborative Research
  • Year:
  • 2010

Abstract

Software Distributed Shared Memory (SDSM) systems offer a convenient way to run applications developed for shared-memory systems on distributed systems without modifying them. However, because SDSM systems add an extra layer of abstraction to the memory hierarchy, applications running on top of them may suffer performance problems. Our main research interest is to develop a set of compiler and runtime system techniques that widen the range of applications that can run efficiently on SDSM systems. We currently target OpenMP applications because of the ease of use this programming model provides. In this paper we show the performance of a set of regular applications that perform well on our SDSM system. They were adapted from OpenCL codes provided by ATI and rewritten in OpenMP. More complex applications with different data access patterns are harder to run efficiently on a DSM system. As an example, we present a performance evaluation of the NAS MG benchmark and two techniques we have developed to improve its data locality. Our SDSM infrastructure is composed of NanosDSM, an everything-shared SDSM developed at the Technical University of Catalonia (UPC) and the Barcelona Supercomputing Center (BSC), and the IBM XL SMP Runtime, which allows the execution of OpenMP applications.