A growing share of server workloads is Web-based. In these workloads, most memory objects are used only during a single transaction. We study how the choice of memory management approach affects the performance of such Web-based applications on two modern multicore processors. In particular, using six PHP applications, we compare a general-purpose allocator (the default allocator of the PHP runtime) with a region-based allocator, which reduces the cost of memory management by not supporting per-object free. On one processor core, the region-based allocator achieves better performance for all workloads because of its smaller memory management cost. With eight cores, however, the region-based allocator suffers from the hidden cost of increased bus traffic, and its performance falls below that of the default allocator for many workloads, by as much as 27.2%. This is because memory bandwidth tends to become a bottleneck in systems with multicore processors. We propose a new memory management approach, defrag-dodging, to maximize the performance of Web-based workloads on multicore processors. Our approach reduces the memory management cost by avoiding defragmentation overhead in the malloc and free functions during a transaction. We found that transactions in Web-based applications are short enough that heap fragmentation can be ignored, and hence the costs of the defragmentation activities in existing general-purpose allocators outweigh their benefits. By comparing our approach against the region-based approach, we show that a per-object free capability can reduce bus traffic and achieve higher performance on multicore processors. We demonstrate that our defrag-dodging approach improves the performance of all the evaluated applications on both processors, by up to 11.4% over the default allocator and up to 51.5% over the region-based allocator.
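To make the contrast concrete, the following is a minimal C sketch of one way an allocator could keep per-object free while skipping defragmentation work during a transaction. It is an illustration under stated assumptions, not the allocator evaluated in the paper: the names (`TxHeap`, `tx_malloc`, `tx_free`, `tx_end`), the power-of-two size classes, the 64 KiB heap, and the sized-free interface are all hypothetical. Freed blocks go onto a per-size-class free list and are never coalesced or split; the whole heap is reset when the transaction ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names and constants below are hypothetical, chosen for illustration. */
#define HEAP_BYTES  (64u * 1024u)  /* per-transaction heap size */
#define NUM_CLASSES 8              /* size classes: 16, 32, ..., 2048 bytes */
#define MIN_SHIFT   4              /* smallest class is 1 << 4 = 16 bytes */

typedef struct FreeNode { struct FreeNode *next; } FreeNode;

typedef struct {
    unsigned char *heap;               /* backing storage for one transaction */
    size_t bump;                       /* offset of the next never-used byte */
    FreeNode *free_list[NUM_CLASSES];  /* one LIFO list per size class */
} TxHeap;

/* Map a request size to a size-class index, or -1 if it is too large. */
static int size_class(size_t n) {
    for (int c = 0; c < NUM_CLASSES; c++)
        if (n <= ((size_t)1 << (MIN_SHIFT + c))) return c;
    return -1;
}

/* Drop every object at once, as a region allocator would at transaction end. */
void tx_end(TxHeap *h) {
    h->bump = 0;
    for (int c = 0; c < NUM_CLASSES; c++) h->free_list[c] = NULL;
}

int tx_init(TxHeap *h) {
    h->heap = malloc(HEAP_BYTES);
    if (!h->heap) return -1;
    tx_end(h);
    return 0;
}

void tx_destroy(TxHeap *h) { free(h->heap); h->heap = NULL; }

void *tx_malloc(TxHeap *h, size_t n) {
    int c = size_class(n);
    if (c < 0) return NULL;            /* large objects are out of scope here */
    FreeNode *node = h->free_list[c];
    if (node) {                        /* reuse a freed block: no splitting */
        h->free_list[c] = node->next;
        return node;
    }
    size_t sz = (size_t)1 << (MIN_SHIFT + c);
    if (sz > HEAP_BYTES - h->bump) return NULL;  /* heap exhausted */
    void *p = h->heap + h->bump;       /* otherwise bump-allocate a new block */
    h->bump += sz;
    return p;
}

/* Per-object free: push onto the class list, with no coalescing of neighbors.
   The caller passes the original request size (an assumption of this sketch). */
void tx_free(TxHeap *h, void *p, size_t n) {
    int c = size_class(n);
    assert(c >= 0 && p != NULL);
    FreeNode *node = p;
    node->next = h->free_list[c];
    h->free_list[c] = node;
}
```

In this sketch a block released with `tx_free` is handed back by the next `tx_malloc` of the same size class, so a short-lived object's memory is reused rather than replaced by fresh memory that has never been touched. A pure region allocator would instead keep advancing `bump` until `tx_end`, which is one plausible reading of why the abstract associates per-object free with lower bus traffic.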