XAMM: a high-performance automatic memory management system with memory-constrained designs

  • Authors:
  • Gansha Wu; Xin Zhou; Guei-Yuan Lueh; Jesse Z. Fang; Peng Guo; Jinzhan Peng; Victor Ying

  • Affiliations:
  • Intel China Research Center, Beijing, China; Intel China Research Center, Beijing, China; Intel Corporation, Software Solutions Group, CA; Intel Corporation, Corporate Technology Group, CA; Intel China Research Center, Beijing, China; Intel China Research Center, Beijing, China; Intel China Research Center, Beijing, China

  • Venue:
  • HiPEAC'05: Proceedings of the First International Conference on High Performance Embedded Architectures and Compilers
  • Year:
  • 2005

Abstract

Automatic memory management is prevalent on memory- and computation-constrained systems. Previous research has focused on small memory footprint, garbage collection (GC) pause time, and energy consumption, while performance has largely been left out of the spotlight. This observation inspired us to design memory management techniques that deliver high performance while still keeping space consumption and response time under control. XAMM is our attempt to answer this quest. Driven by these design goals, XAMM implements a variety of novel techniques spanning its object model, heap management, allocation, and GC mechanisms. XAMM also adopts techniques that not only exploit the underlying system's capabilities but also assist optimizations in other runtime components (e.g., the code generator). This paper describes these techniques in detail and reports our implementation experience. We conclude that XAMM demonstrates the feasibility of achieving high performance without breaking memory constraints. We support our claims with evaluation results for a spectrum of real-world programs and synthetic benchmarks. For example, the heap placement optimization can boost system-wide performance by as much as 10%; lazy and selective location bits management can reduce execution time by as much as 14%, while reducing GC pause time by as much as 25% on average. Together, these techniques improve system-wide performance by as much as 56%.
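
The paper itself details the mechanisms; only the headline numbers appear in this abstract. As a rough illustration of the kind of bookkeeping "lazy and selective location bits management" suggests, the sketch below maintains a side bitmap recording which heap words hold references and rebuilds it only when a collection actually needs it. The heap layout, header format, and all names (heap, loc_bits, scan_heap, etc.) are assumptions made for this example, not XAMM's actual data structures.

```c
/*
 * Hypothetical sketch of lazily maintained "location bits": a side bitmap
 * of which heap words hold references, rebuilt only on demand at GC time.
 * Illustrative only; layout and names are invented, not the XAMM design.
 */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define HEAP_WORDS    (1u << 16)                      /* small toy heap     */
#define BITS_PER_WORD 32u

static uint32_t heap[HEAP_WORDS];                     /* word-addressed heap */
static uint32_t loc_bits[HEAP_WORDS / BITS_PER_WORD]; /* 1 bit per heap word */
static int      loc_bits_valid = 0;                   /* built lazily        */

static void set_location_bit(uint32_t i)
{
    loc_bits[i / BITS_PER_WORD] |= 1u << (i % BITS_PER_WORD);
}

static int is_reference_slot(uint32_t i)
{
    return (loc_bits[i / BITS_PER_WORD] >> (i % BITS_PER_WORD)) & 1u;
}

/* Assumed toy object layout: word 0 holds the number of reference fields,
 * which immediately follow the header. Bits are computed only for the
 * objects passed in, and only when a collection is about to scan. */
static void rebuild_location_bits(const uint32_t *objs, size_t n)
{
    memset(loc_bits, 0, sizeof loc_bits);
    for (size_t k = 0; k < n; k++) {
        uint32_t base  = objs[k];
        uint32_t nrefs = heap[base];
        for (uint32_t f = 1; f <= nrefs; f++)
            set_location_bit(base + f);
    }
    loc_bits_valid = 1;
}

/* GC scan consults the bitmap, (re)building it only on demand. */
static void scan_heap(const uint32_t *objs, size_t n)
{
    if (!loc_bits_valid)
        rebuild_location_bits(objs, n);
    for (uint32_t w = 0; w < HEAP_WORDS; w++)
        if (is_reference_slot(w))
            printf("slot %u holds reference 0x%x\n", w, heap[w]);
    loc_bits_valid = 0;   /* mutation after GC invalidates the bitmap */
}

int main(void)
{
    /* One toy object at offset 0 with two reference fields. */
    heap[0] = 2;
    heap[1] = 0x1000;
    heap[2] = 0x2000;
    uint32_t roots[] = { 0 };
    scan_heap(roots, 1);
    return 0;
}
```

The intent of a lazy, selective scheme is to avoid per-allocation or per-store bookkeeping for data that never survives to a collection; the abstract's 14% execution-time and 25% pause-time figures are XAMM's reported gains from its own variant of this idea, whose actual mechanism is described in the paper.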