Improving the cache locality of memory allocation

Authors:
Dirk Grunwald; Benjamin Zorn; Robert Henderson
Venue:
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Year:
1993

Citing 18
Cited 35

Line (block) size choice for CPU cache memories

IEEE Transactions on Computers
An efficient algorithm for heap storage allocation

ACM SIGPLAN Notices
Software prefetching

ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
The effect of context switches on cache performance

ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Effective “static-graph” reorganization to improve locality in garbage-collected systems

PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Performance debugging shared memory multiprocessor programs with MTOOL

Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Caching considerations for generational garbage collection

LFP '92 Proceedings of the 1992 ACM conference on LISP and functional programming
Empirical measurements of six allocation-intensive C programs

ACM SIGPLAN Notices
Optimally profiling and tracing programs

POPL '92 Proceedings of the 19th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Using lifetime predictors to improve memory allocation performance

PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers

ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
The software lookaside buffer reduces search overhead with linked lists

Communications of the ACM
A real-time garbage collector based on the lifetimes of objects

Communications of the ACM
Data Structure Techniques

Data Structure Techniques
Object Type Directed Garbage Collection To Improve Locality

IWMM '92 Proceedings of the International Workshop on Memory Management
Garbage collection in a large LISP system

LFP '84 Proceedings of the 1984 ACM Symposium on LISP and functional programming
Generation Scavenging: A non-disruptive high performance storage reclamation algorithm

SDE 1 Proceedings of the first ACM SIGSOFT/SIGPLAN software engineering symposium on Practical software development environments
Comparative performance evaluation of garbage collection algorithms

Comparative performance evaluation of garbage collection algorithms

Evaluating models of memory allocation

ACM Transactions on Modeling and Computer Simulation (TOMACS)
The influence of caches on the performance of heaps

Journal of Experimental Algorithmics (JEA)
Memory management with explicit regions

PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Memory allocation for long-running server applications

Proceedings of the 1st international symposium on Memory management
Segregating heap objects by reference behavior and lifetime

Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Cache-conscious data placement

Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Tornado: maximizing locality and concurrency in a shared memory multiprocessor operating system

OSDI '99 Proceedings of the third symposium on Operating systems design and implementation
Automated data-member layout of heap objects to improve memory-hierarchy performance

ACM Transactions on Programming Languages and Systems (TOPLAS)
Hoard: a scalable memory allocator for multithreaded applications

ACM SIGPLAN Notices
Composing high-performance memory allocators

Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
Hoard: a scalable memory allocator for multithreaded applications

ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Avoiding initialization misses to the heap

ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Tuning garbage collection for reducing memory system energy in an embedded java environment

ACM Transactions on Embedded Computing Systems (TECS)
Shared State for Distributed Interactive Data Mining Applications

Distributed and Parallel Databases - Special issue: Parallel and distributed data mining
A proposal for a new hardware cache monitoring architecture

Proceedings of the 2002 workshop on Memory system performance
An algorithm with constant execution time for dynamic storage allocation

RTCSA '95 Proceedings of the 2nd International Workshop on Real-Time Computing Systems and Applications
A locality-improving dynamic memory allocator

Proceedings of the 2005 workshop on Memory system performance
Analyzing data reuse for cache reconfiguration

ACM Transactions on Embedded Computing Systems (TECS)
Practical Structure Layout Optimization and Advice

Proceedings of the International Symposium on Code Generation and Optimization
Scalable locality-conscious multithreaded memory allocation

Proceedings of the 5th international symposium on Memory management
DieHard: probabilistic memory safety for unsafe languages

Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
Whole-program optimization of global variable layout

Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Controlling garbage collection and heap growth to reduce the execution time of Java applications

ACM Transactions on Programming Languages and Systems (TOPLAS)
Exterminator: automatically correcting memory errors with high probability

Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
The slab allocator: an object-caching kernel memory allocator

USTC'94 Proceedings of the USENIX Summer 1994 Technical Conference - Volume 1
malloc() performance in a multithreaded Linux environment

ATEC '00 Proceedings of the annual conference on USENIX Annual Technical Conference
Object co-location and memory reuse for Java programs

ACM Transactions on Architecture and Code Optimization (TACO)
Two memory allocators that use hints to improve locality

Proceedings of the 2009 international symposium on Memory management
Memory management thread for heap allocation intensive sequential applications

Proceedings of the 10th workshop on MEmory performance: DEaling with Applications, systems and architecture
Custom memory allocation for free

LCPC'06 Proceedings of the 19th international conference on Languages and compilers for parallel computing
Optimal resource management for a model driven LTE protocol stack on a multicore platform

Proceedings of the 8th ACM international workshop on Mobility management and wireless access
Data layout for cache performance on a multithreaded architecture

Transactions on high-performance embedded architectures and compilers III
Efficient protection against heap-based buffer overflows without resorting to magic

ICICS'06 Proceedings of the 8th international conference on Information and Communications Security
Cache and I/O efficient functional algorithms

POPL '13 Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Heap decomposition inference with linear programming

ECOOP'13 Proceedings of the 27th European conference on Object-Oriented Programming

Abstract

The allocation and disposal of memory is a ubiquitous operation in most programs. Rarely do programmers concern themselves with details of memory allocators; most assume that memory allocators provided by the system perform well. This paper presents a performance evaluation of the reference locality of dynamic storage allocation algorithms based on trace-driven simulation of five large allocation-intensive C programs. We show how the design of a memory allocator can significantly affect the reference locality for various applications. Our measurements show that poor locality in sequential-fit allocation algorithms reduces program performance by increasing both paging and cache miss rates. While increased paging can be debilitating on any architecture, cache miss rates are also important for modern computer architectures. We show that algorithms attempting to be space-efficient by coalescing adjacent free objects show poor reference locality, possibly negating the benefits of space efficiency. At the other extreme, algorithms can expend considerable effort to increase reference locality yet gain little in total execution performance. Our measurements suggest an allocator design that is both very fast and has good locality of reference.