Representative Multiprogram Workloads for Multithreaded Processor Simulation

Authors:
Michael Van Biesbrouck;Lieven Eeckhout;Brad Calder
Affiliations:
CSE, University of California, San Diego, USA. Email: mvanbies@cs.ucsd.edu;ELIS, Ghent University, Belgium. Email: leeckhou@elis.UGent.be;Microsoft. Email: calder@cs.ucsd.edu
Venue:
IISWC '07 Proceedings of the 2007 IEEE 10th International Symposium on Workload Characterization
Year:
2007

Citing 0
Cited 2

An Architecture Independent Approach to Emulating Computation Intensive Workload for Early Integration Testing of Enterprise DRE Systems

OTM '09 Proceedings of the Confederated International Conferences, CoopIS, DOA, IS, and ODBASE 2009 on On the Move to Meaningful Internet Systems: Part I
Selecting representative benchmark inputs for exploring microprocessor design spaces

ACM Transactions on Architecture and Code Optimization (TACO)

Quantified Score

Hi-index	0.00

Visualization

Abstract

Almost all new consumer-grade processors are capable of executing multiple programs simultaneously. The analysis of multiprogrammed workloads for multicore and SMT processors is challenging and time-consuming because there are many possible combinations of benchmarks to execute and each combination may exhibit several different interesting behaviors. Missing particular combinations of program behaviors could hide performance problems with designs. It is thus of utmost importance to have a representative multiprogrammed workload when evaluating multithreaded processor designs. This paper presents a methodology that uses phase analysis, principal components analysis (PCA) and cluster analysis (CA) applied to microarchitecture-independent program characteristics in order to find important program interactions in multiprogrammed workloads. The end result is a small set of co-phases with associated weights that are representative for a multiprogrammed workload across multithreaded processor architectures. Applying our methodology to the SPEC CPU 2000 benchmark suite yields 50 distinct combinations for two-context multithreaded processor simulation that6 researchers and architects can use for simulation. Each combination is simulated for 50 million instructions, giving a total of 2.5 billion instructions to be simulated for the SPEC CPU2000 benchmark suite. The performance prediction error with these representative combinations is under 2.5% of the real workload for absolute throughput prediction and can be used to make relative throughput comparisons across processor architectures.