A lightweight hybrid hardware/software approach for object-relative memory profiling

  • Authors:
  • Licheng Chen;Zehan Cui;Yungang Bao;Mingyu Chen;Yongbing Huang;Guangming Tan

  • Affiliations:
  • State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, China (all authors)

  • Venue:
  • ISPASS '12 Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software
  • Year:
  • 2012

Abstract

Memory profiling is the process of collecting memory address traces during program execution and then analyzing and characterizing the program's memory behavior offline. As more and more cores are integrated on a single processor chip, the "Memory Wall" problem will become more serious in chip multiprocessor (CMP) systems. Accurate and effective memory profiling is therefore becoming one of the keys to identifying the source of memory system bottlenecks. A large body of work has addressed memory profiling; however, most of it relies on instrumentation or simulation, which incur heavy overhead, or on hardware performance counters, which lack detailed trace information. Furthermore, correlating raw memory address traces with object-relative information allows us to separate the regular access pattern of a particular object from the irregular mixed trace, which helps optimization. In this paper, we propose a lightweight hybrid hardware/software approach for object-relative memory profiling. We monitor physical memory addresses through hardware snooping with negligible overhead; meanwhile, we dump the Linux kernel page tables of processes, as well as object-relative memory allocation information. Our approach supports not only collecting applications' full memory traces with detailed object-relative information, but also identifying, at the object level, hardware-generated memory accesses such as page table walks caused by TLB misses. Experimental results on a real system show that our approach is highly accurate (the largest error is 2.04%) and has low overhead (the average overhead is 1.60%). Furthermore, we profile two multi-threaded applications in detail and successfully identify hot TLB-miss objects. With object-targeted optimization, we improve the applications' performance by nearly 6.86%.
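The correlation step described in the abstract (physical addresses from the snooped trace, mapped back through dumped page tables, then matched against allocation records) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the data formats, the 4 KiB page size, the one-to-one frame-to-page mapping, and all function names (`build_reverse_page_map`, `physical_to_virtual`, `find_object`, `profile`) are assumptions made for the example.

```python
from bisect import bisect_right
from collections import Counter

PAGE_SIZE = 4096  # assumption: 4 KiB pages


def build_reverse_page_map(page_table):
    """Invert a dumped page table.

    page_table is a hypothetical dump format: {virtual page number: physical
    frame number}. Assumes each frame backs exactly one virtual page.
    """
    return {pfn: vpn for vpn, pfn in page_table.items()}


def physical_to_virtual(paddr, reverse_map):
    """Translate a snooped physical address back to a virtual address."""
    pfn, offset = divmod(paddr, PAGE_SIZE)
    vpn = reverse_map.get(pfn)
    return None if vpn is None else vpn * PAGE_SIZE + offset


def find_object(vaddr, objects, starts):
    """Return the name of the allocated object containing vaddr, if any.

    objects is a list of (start, size, name) tuples sorted by start address;
    starts is the matching list of start addresses for binary search.
    """
    i = bisect_right(starts, vaddr) - 1
    if i >= 0:
        start, size, name = objects[i]
        if vaddr < start + size:
            return name
    return None


def profile(trace, page_table, objects):
    """Count accesses per object for a trace of physical addresses."""
    reverse_map = build_reverse_page_map(page_table)
    objects = sorted(objects)
    starts = [start for start, _, _ in objects]
    counts = Counter()
    for paddr in trace:
        vaddr = physical_to_virtual(paddr, reverse_map)
        name = None if vaddr is None else find_object(vaddr, objects, starts)
        counts[name if name is not None else "<unknown>"] += 1
    return counts
```

Given a page table dump, a list of allocation records, and a physical address trace, `profile` returns per-object access counts; addresses that fall outside any mapped page or recorded object are grouped under `<unknown>`. A real system would also have to handle shared pages, page table changes over time, and large pages, none of which this sketch attempts.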