Experiences in building and scaling an enterprise application on multicore systems

Authors:
Seetharami Seelam;Yanbin Liu;Parijat Dube;Megumi Ito;Deniz Binay;Michael Dawson;Pramod Nagaraja;Graeme Johnson;Liana Fong;Michel Hack;Xiaoqiao Meng;Yuqing Gao;Li Zhang
Affiliations:
IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA;IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA;IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA;IBM Tokyo Research Lab, Tokyo, Japan;IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA;IBM Software Group, Ottawa, Canada;IBM Software Group, Bangalore, India;IBM Software Group, Ottawa, Canada;IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA;IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA;IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA;IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA;IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA
Venue:
Concurrency and Computation: Practice & Experience
Year:
2012

Abstract

Although Java is the de facto programming language for enterprise applications, only a limited number of Java-based benchmarks exist for understanding performance on emerging multicore systems. To bridge this gap, this paper presents a report generation benchmark built on top of the open source Apache Geronimo DayTrader benchmark. Report generation and rendering is at the heart of many enterprise business analytics and business intelligence software products, and it is used by many enterprise applications. We evaluate the performance scalability of this benchmark on a state-of-the-art Power7 multicore system with eight Power7 cores and 32 hardware threads. The benchmark throughput scales linearly up to eight hardware threads, but beyond that point it falls sharply. Significant locking in the Java class libraries for non-shared objects causes this performance drop. Splitting the locks on these shared classes results in near-linear scaling from eight to 32 threads and improves throughput by 80%. We also show that Linux operating system load balancing can degrade application performance on hardware multithreaded systems, and that simultaneous-multithreading-aware task scheduling results in uniform core-resource utilization as well as improved application performance. Copyright © 2011 John Wiley & Sons, Ltd.
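
The lock-splitting idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which Java library classes were involved or how their locks were split, so everything below is an assumption made for illustration: the class names (CoarseFormatter, SplitFormatter), the Formatter interface, and the run helper are hypothetical. The coarse variant guards per-instance state with a single class-wide lock, so threads working on unrelated objects still serialize; the split variant gives each instance its own lock.

```java
import java.util.function.Supplier;

public class LockSplittingSketch {

    /** Hypothetical interface, used only so both variants share one driver. */
    interface Formatter {
        String format(long value);
        long count();
    }

    /** Coarse variant: one class-wide lock guards state that is per-instance. */
    static final class CoarseFormatter implements Formatter {
        private static final Object CLASS_LOCK = new Object();
        private long formatted;

        public String format(long value) {
            synchronized (CLASS_LOCK) { // all instances contend on this one lock
                formatted++;
                return "v=" + value;
            }
        }

        public long count() {
            synchronized (CLASS_LOCK) {
                return formatted;
            }
        }
    }

    /** Split variant: each instance owns its lock, so unrelated objects do not contend. */
    static final class SplitFormatter implements Formatter {
        private final Object lock = new Object();
        private long formatted;

        public String format(long value) {
            synchronized (lock) {
                formatted++;
                return "v=" + value;
            }
        }

        public long count() {
            synchronized (lock) {
                return formatted;
            }
        }
    }

    /** Gives each thread its own formatter and returns the total number of calls made. */
    static long run(Supplier<Formatter> factory, int threads, int callsPerThread) {
        Formatter[] formatters = new Formatter[threads];
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            Formatter f = factory.get();
            formatters[i] = f;
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    f.format(j);
                }
            });
            workers[i].start();
        }
        long total = 0;
        for (int i = 0; i < threads; i++) {
            try {
                workers[i].join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while joining", e);
            }
            total += formatters[i].count();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(run(CoarseFormatter::new, 8, 100_000));
        System.out.println(run(SplitFormatter::new, 8, 100_000));
    }
}
```

Both variants compute the same totals; they differ only in how much the threads contend. Any throughput difference depends on the hardware, the JVM, and the work done inside the critical section, so this sketch makes no claim about reproducing the scaling results reported in the abstract.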