KMA: A Dynamic Memory Manager for OpenCL

Authors:
Roy Spliet; Lee Howes; Benedict R. Gaster; Ana Lucia Varbanescu
Affiliations:
Delft University of Technology, The Netherlands; AMD, USA; AMD, USA; University of Amsterdam, The Netherlands
Venue:
Proceedings of Workshop on General Purpose Processing Using GPUs
Year:
2014

Abstract

OpenCL is becoming a popular choice for the parallel programming of both multi-core CPUs and GPGPUs. One feature that OpenCL lacks, yet that irregular parallel applications commonly require, is dynamic memory allocation. In this paper, we propose KMA, a first dynamic memory allocator for OpenCL. KMA's design is based on a thorough analysis of a set of 11 algorithms, which shows that dynamic memory allocation is a necessary commodity, typically used for implementing complex data structures (arrays, lists, trees) that need constant restructuring at runtime. Taking into account both the survey findings and the status quo of OpenCL, we design KMA as a two-layer memory manager that makes smart use of the patterns we identified in our application analysis: its lower layer provides generic malloc() and free() APIs, while the higher layer provides support for building and efficiently managing dynamic data structures. Our experiments measure the performance and usability of KMA, using both microbenchmarks and a real-life case study. Results show that when dynamic allocation is mandatory, KMA is a competitive allocator. We conclude that embedding dynamic memory allocation in OpenCL is feasible, but that it is a complex, delicate task due to the massive parallelism of the platform and the portability issues between different OpenCL implementations.
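
To make the idea of a generic malloc()/free() layer over a preallocated heap more concrete, the following is a minimal sketch in host-side C11. It is not KMA's algorithm, which the abstract does not describe. It assumes a fixed pool of equally sized blocks tracked by one atomic bitmap, and every name in it (`pool_t`, `pool_malloc`, `pool_free`, `BLOCK_BYTES`, `NUM_BLOCKS`) is hypothetical. An OpenCL kernel version would have to use OpenCL C atomics on a `__global` buffer instead of `<stdatomic.h>`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes chosen for illustration only. */
enum { BLOCK_BYTES = 64, NUM_BLOCKS = 32 };

typedef struct {
    _Alignas(16) unsigned char heap[NUM_BLOCKS * BLOCK_BYTES];
    atomic_uint_least32_t used;   /* bit i set => block i is allocated */
} pool_t;

static void pool_init(pool_t *p) {
    atomic_init(&p->used, 0);
}

/* Claim the lowest free block with a compare-and-swap loop.
 * Returns NULL if n is 0, n exceeds one block, or the pool is full. */
static void *pool_malloc(pool_t *p, size_t n) {
    if (n == 0 || n > BLOCK_BYTES) return NULL;
    uint_least32_t old = atomic_load(&p->used);
    for (;;) {
        int i = 0;
        while (i < NUM_BLOCKS && ((old >> i) & 1u)) i++;
        if (i == NUM_BLOCKS) return NULL;            /* no free block */
        uint_least32_t want = old | ((uint_least32_t)1 << i);
        if (atomic_compare_exchange_weak(&p->used, &old, want))
            return p->heap + (size_t)i * BLOCK_BYTES;
        /* CAS failed: 'old' now holds the current bitmap, so retry. */
    }
}

/* Release a block by atomically clearing its bit. */
static void pool_free(pool_t *p, void *ptr) {
    if (ptr == NULL) return;
    size_t off = (size_t)((unsigned char *)ptr - p->heap);
    assert(off % BLOCK_BYTES == 0 && off / BLOCK_BYTES < NUM_BLOCKS);
    atomic_fetch_and(&p->used, ~((uint_least32_t)1 << (off / BLOCK_BYTES)));
}
```

A caller would run `pool_init` once, then call `pool_malloc` and `pool_free` from any number of threads. The bitmap keeps the sketch short and sidesteps the ABA problem of lock-free free lists, but a single shared word becomes a contention point under massive parallelism, which is the kind of difficulty the abstract alludes to.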