As we enter the era of chip multiprocessor (CMP) architectures, it is important to explore the scaling characteristics of mainstream server workloads on these platforms. In this paper, we analyze the performance of two significant Enterprise Java workloads (SPECjAppServer2004 and SPECjbb2005) on present and future CMP platforms. We start by characterizing the core, cache, and memory behavior of these workloads on the newly released Intel Core 2 Duo Xeon platform (dual-core, dual-socket). These measurements indicate that the performance of these workloads depends significantly on the cache and memory subsystems. To guide the evolution of future CMP platforms, we perform a detailed investigation of potential cache and memory architecture choices, including the effects of thread sharing and migration, object allocation, and garbage collection. Based on the observed behavior, we propose architectural optimizations along three dimensions: (a) data-less cache line initialization (DCLI), (b) hardware-guided thread collocation (HGTC), and (c) on-socket DRAM caches (OSDC). We describe these optimizations in detail and validate their performance potential using trace-driven simulation and execution-driven emulation. Overall, we expect the findings in this paper to guide future CMP architectures for Enterprise Java servers.