Scalable memory registration for high performance networks using helper threads

  • Authors:
  • Dong Li;Kirk W. Cameron;Dimitrios S. Nikolopoulos;Bronis R. de Supinski;Martin Schulz

  • Affiliations:
  • Virginia Tech;Virginia Tech;FORTH-ICS and University of Crete, Greece;Lawrence Livermore National Lab;Lawrence Livermore National Lab

  • Venue:
  • Proceedings of the 8th ACM International Conference on Computing Frontiers
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Remote DMA (RDMA) enables high performance networks to reduce data copying between an application and the operating system (OS). However RDMA operations in some high performance networks require communication memory explicitly registered with the network adapter and pinned by the OS. Memory registration and pinning limits the flexibility of the memory system and reduces the amount of memory that user processes can allocate. These issues become more significant on multicore platforms, since registered memory demand grows linearly with the number of processor cores. In this paper we propose a new memory registration/deregistration strategy to reduce registered memory on multicore architectures for HPC applications. We hide the cost of dynamic memory management by offloading all dynamic memory registration and deregistration requests to a dedicated memory management helper thread. We investigate design policies and performance implications of the helper thread approach. We evaluate our framework with the NAS parallel benchmarks, for which our registration scheme significantly reduces the registered memory (23.62% on average and up to 49.39%) and avoids memory registration/deregistration costs for reused communication memory. We show that our system enables the execution of problem sizes that could not complete under existing memory registration strategies.