Line (block) size choice for CPU cache memories
IEEE Transactions on Computers
An efficient algorithm for heap storage allocation
ACM SIGPLAN Notices
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
The effect of context switches on cache performance
ASPLOS IV Proceedings of the fourth international conference on Architectural support for programming languages and operating systems
Effective “static-graph” reorganization to improve locality in garbage-collected systems
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Performance debugging shared memory multiprocessor programs with MTOOL
Proceedings of the 1991 ACM/IEEE conference on Supercomputing
Caching considerations for generational garbage collection
LFP '92 Proceedings of the 1992 ACM conference on LISP and functional programming
Empirical measurements of six allocation-intensive C programs
ACM SIGPLAN Notices
Optimally profiling and tracing programs
POPL '92 Proceedings of the 19th ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Using lifetime predictors to improve memory allocation performance
PLDI '93 Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
The software lookaside buffler reduces search overhead with linked lists
Communications of the ACM
A real-time garbage collector based on the lifetimes of objects
Communications of the ACM
Data Structure Techniques
Object Type Directed Garbage Collection To Improve Locality
IWMM '92 Proceedings of the International Workshop on Memory Management
Garbage collection in a large LISP system
LFP '84 Proceedings of the 1984 ACM Symposium on LISP and functional programming
Generation Scavenging: A non-disruptive high performance storage reclamation algorithm
SDE 1 Proceedings of the first ACM SIGSOFT/SIGPLAN software engineering symposium on Practical software development environments
Comparative performance evaluation of garbage collection algorithms
Comparative performance evaluation of garbage collection algorithms
Evaluating models of memory allocation
ACM Transactions on Modeling and Computer Simulation (TOMACS)
The influence of caches on the performance of heaps
Journal of Experimental Algorithmics (JEA)
Memory management with explicit regions
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Memory allocation for long-running server applications
Proceedings of the 1st international symposium on Memory management
Segregating heap objects by reference behavior and lifetime
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Cache-conscious data placement
Proceedings of the eighth international conference on Architectural support for programming languages and operating systems
Tornado: maximizing locality and concurrency in a shared memory multiprocessor operating system
OSDI '99 Proceedings of the third symposium on Operating systems design and implementation
Automated data-member layout of heap objects to improve memory-hierarchy performance
ACM Transactions on Programming Languages and Systems (TOPLAS)
Hoard: a scalable memory allocator for multithreaded applications
ACM SIGPLAN Notices
Composing high-performance memory allocators
Proceedings of the ACM SIGPLAN 2001 conference on Programming language design and implementation
Hoard: a scalable memory allocator for multithreaded applications
ASPLOS IX Proceedings of the ninth international conference on Architectural support for programming languages and operating systems
Avoiding initialization misses to the heap
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Tuning garbage collection for reducing memory system energy in an embedded java environment
ACM Transactions on Embedded Computing Systems (TECS)
Shared State for Distributed Interactive Data Mining Applications
Distributed and Parallel Databases - Special issue: Parallel and distributed data mining
A proposal for a new hardware cache monitoring architecture
Proceedings of the 2002 workshop on Memory system performance
An algorithm with constant execution time for dynamic storage allocation
RTCSA '95 Proceedings of the 2nd International Workshop on Real-Time Computing Systems and Applications
A locality-improving dynamic memory allocator
Proceedings of the 2005 workshop on Memory system performance
Analyzing data reuse for cache reconfiguration
ACM Transactions on Embedded Computing Systems (TECS)
Practical Structure Layout Optimization and Advice
Proceedings of the International Symposium on Code Generation and Optimization
Scalable locality-conscious multithreaded memory allocation
Proceedings of the 5th international symposium on Memory management
DieHard: probabilistic memory safety for unsafe languages
Proceedings of the 2006 ACM SIGPLAN conference on Programming language design and implementation
Whole-program optimization of global variable layout
Proceedings of the 15th international conference on Parallel architectures and compilation techniques
Controlling garbage collection and heap growth to reduce the execution time of Java applications
ACM Transactions on Programming Languages and Systems (TOPLAS)
Exterminator: automatically correcting memory errors with high probability
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation
The slab allocator: an object-caching kernel memory allocator
USTC'94 Proceedings of the USENIX Summer 1994 Technical Conference on USENIX Summer 1994 Technical Conference - Volume 1
malloc() performance in a multithreaded Linux environment
ATEC '00 Proceedings of the annual conference on USENIX Annual Technical Conference
Object co-location and memory reuse for Java programs
ACM Transactions on Architecture and Code Optimization (TACO)
Two memory allocators that use hints to improve locality
Proceedings of the 2009 international symposium on Memory management
Memory management thread for heap allocation intensive sequential applications
Proceedings of the 10th workshop on MEmory performance: DEaling with Applications, systems and architecture
Custom memory allocation for free
LCPC'06 Proceedings of the 19th international conference on Languages and compilers for parallel computing
Optimal resource management for a model driven LTE protocol stack on a multicore platform
Proceedings of the 8th ACM international workshop on Mobility management and wireless access
Data layout for cache performance on a multithreaded architecture
Transactions on high-performance embedded architectures and compilers III
Efficient protection against heap-based buffer overflows without resorting to magic
ICICS'06 Proceedings of the 8th international conference on Information and Communications Security
Cache and I/O efficent functional algorithms
POPL '13 Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
Heap decomposition inference with linear programming
ECOOP'13 Proceedings of the 27th European conference on Object-Oriented Programming
Hi-index | 0.00 |
The allocation and disposal of memory is a ubiquitous operation in most programs. Rarely do programmers concern themselves with details of memory allocators; most assume that memory allocators provided by the system perform well. This paper presents a performance evaluation of the reference locality of dynamic storage allocation algorithms based on trace-driven simualtion of five large allocation-intensive C programs. In this paper, we show how the design of a memory allocator can significantly affect the reference locality for various applications. Our measurements show that poor locality in sequential-fit allocation algorithms reduces program performance, both by increasing paging and cache miss rates. While increased paging can be debilitating on any architecture, cache misses rates are also important for modern computer architectures. We show that algorithms attempting to be space-efficient by coalescing adjacent free objects show poor reference locality, possibly negating the benefits of space efficiency. At the other extreme, algorithms can expend considerable effort to increase reference locality yet gain little in total execution performance. Our measurements suggest an allocator design that is both very fast and has good locality of reference.