On understanding the energy consumption of ARM-based multicore servers

  • Authors:
  • Bogdan Marius Tudor;Yong Meng Teo

  • Affiliations:
  • National University of Singapore, Singapore, Singapore;National University of Singapore, Singapore, Singapore

  • Venue:
  • Proceedings of the ACM SIGMETRICS/International Conference on Measurement and Modeling of Computer Systems
  • Year:
  • 2013

Abstract

There is growing interest in replacing traditional servers with low-power multicore systems such as ARM Cortex-A9. However, such systems are typically provisioned for mobile applications, which have lower memory and I/O requirements than server applications. Thus, it is unclear how much the imbalance between application demands and system resources affects the energy-efficient execution of server workloads. This paper proposes a trace-driven analytical model for understanding the energy performance of server workloads on ARM Cortex-A9 multicore systems. Key to our approach is modeling the degree of overlap among CPU core, memory and I/O resources, and estimating the number of cores and the clock frequency that optimize energy performance without compromising execution time. Since energy usage is the product of utilized power and execution time, the model first estimates the execution time of a program. CPU time, which accounts for both core and memory response time, is modeled as an M/G/1 queuing system. Workload characterization of high performance computing, web hosting and financial computing applications shows that bursty memory traffic fits a Pareto distribution, and non-bursty memory traffic is exponentially distributed. Our analysis using these server workloads reveals that not all server workloads benefit from a higher number of cores or higher clock frequencies. Applying our model, we predict configurations that increase energy efficiency by 10% without turning off cores, and by up to one third when unutilized cores are shut down. For memory-bound programs, we show that the limited memory bandwidth might increase both execution time and energy usage, to the point where the energy cost might be higher than on a typical x64 multicore system. Lastly, we show that increasing memory and I/O bandwidth can improve both the execution time and the energy usage of server workloads on ARM Cortex-A9 systems.
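The two quantitative ideas named in the abstract, energy as the product of power and execution time, and an M/G/1 queue for response time, can be illustrated with a minimal sketch. This is not the paper's model: it uses only the textbook Pollaczek-Khinchine formula for the mean response time of an M/G/1 queue, and the function names, parameters, and the choice of an exponential service time are assumptions made for illustration.

```python
def mg1_mean_response_time(arrival_rate, mean_service, second_moment_service):
    """Mean response time of an M/G/1 queue (Pollaczek-Khinchine formula).

    arrival_rate: Poisson arrival rate (requests per second).
    mean_service: mean service time E[S] (seconds).
    second_moment_service: second moment E[S^2] of the service time.
    All three parameters are hypothetical inputs, not values from the paper.
    """
    rho = arrival_rate * mean_service  # server utilization
    if not 0.0 <= rho < 1.0:
        raise ValueError("unstable queue: utilization must be in [0, 1)")
    # Mean time spent waiting in the queue before service starts.
    mean_wait = arrival_rate * second_moment_service / (2.0 * (1.0 - rho))
    return mean_wait + mean_service


def exponential_second_moment(mean_service):
    """E[S^2] for an exponentially distributed service time with the given mean."""
    return 2.0 * mean_service ** 2


def energy_joules(power_watts, exec_time_seconds):
    """Energy as the product of average power drawn and execution time."""
    return power_watts * exec_time_seconds
```

With an exponential service time the queue reduces to M/M/1, which gives a simple sanity check: an arrival rate of 0.5 and a mean service time of 1.0 yield a mean response time of 2.0. A heavier-tailed service distribution has a larger second moment and therefore a longer mean wait at the same utilization.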