Real-time concurrent collection on stock multiprocessors
PLDI '88 Proceedings of the ACM SIGPLAN 1988 conference on Programming Language design and Implementation
Improving the cache locality of memory allocation
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
CustoMalloc: efficient synthesized memory allocators
Software—Practice & Experience
Evaluating models of memory allocation
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Memory allocation costs in large C and C++ programs
Software—Practice & Experience
Garbage collection: algorithms for automatic dynamic memory management
Garbage collection: algorithms for automatic dynamic memory management
Dynamic Storage Allocation: A Survey and Critical Review
IWMM '95 Proceedings of the International Workshop on Memory Management
Scalability of dynamic storage allocation algorithms
FRONTIERS '96 Proceedings of the 6th Symposium on the Frontiers of Massively Parallel Computation
New methods for dynamic storage allocation (Fast Fits)
SOSP '83 Proceedings of the ninth ACM symposium on Operating systems principles
Hoard: a scalable memory allocator for multithreaded applications
ACM SIGPLAN Notices
Hoard: a scalable memory allocator for multithreaded applications
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Proceedings of the 3rd international symposium on Memory management
Are Mallocs Free of Fragmentation?
Proceedings of the FREENIX Track: 2001 USENIX Annual Technical Conference
Improving server software support for simultaneous multithreaded processors
Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of parallel programming
Scalable lock-free dynamic memory allocation
Proceedings of the ACM SIGPLAN 2004 conference on Programming language design and implementation
The design and analysis of a quantitative simulator for dynamic memory management
Journal of Systems and Software
Scalable locality-conscious multithreaded memory allocation
Proceedings of the 5th international symposium on Memory management
Clustering the heap in multi-threaded applications for improved garbage collection
Proceedings of the 8th annual conference on Genetic and evolutionary computation
A tunable hybrid memory allocator
Journal of Systems and Software
malloc() performance in a multithreaded Linux environment
ATEC '00 Proceedings of the annual conference on USENIX Annual Technical Conference
Enabling scalability and performance in a large scale CMP environment
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007
Allocating memory in a lock-free manner
ESA'05 Proceedings of the 13th annual European conference on Algorithms
Optimizing c multithreaded memory management using thread-local storage
CC'05 Proceedings of the 14th international conference on Compiler Construction
SSMalloc: a low-latency, locality-conscious memory allocator with stable performance scalability
Proceedings of the Asia-Pacific Workshop on Systems
SSMalloc: a low-latency, locality-conscious memory allocator with stable performance scalability
APSys'12 Proceedings of the Third ACM SIGOPS Asia-Pacific conference on Systems
AS-GC: an efficient generational garbage collector for java application servers
ECOOP'07 Proceedings of the 21st European conference on Object-Oriented Programming
ACDC: towards a universal mutator for benchmarking heap management systems
Proceedings of the 2013 international symposium on memory management
Power-aware dynamic memory management on many-core platforms utilizing DVFS
ACM Transactions on Embedded Computing Systems (TECS) - Special Section on ESTIMedia'10
Hi-index | 0.00 |
Prior work on dynamic memory allocation has largely neglected long-running server applications, for example, web servers and mail servers. Their requirements differ from those of one-shot applications like compilers or text editors. We investigated how to build an allocator that is not only fast and memory efficient but also scales well on SMP machines. We found that it is not sufficient to focus on reducing lock contention - higher speedups require a reduction in cache misses and bus traffic. We then designed and prototyped a new allocator, called LKmalloc, targeted for both traditional applications and server applications. LKmalloc uses several subheaps, each one with a separate set of free lists and memory arena. A thread always allocates from the same subheap but can free a block belonging to any subheap. A thread is assigned to a subheap by hashing on its thread ID. WC compared its performance with several other allocators on a server-like, simulated workload and found that it indeed scales well and is quite fast hut memory more efficiently.