Improved automatic testcase synthesis for performance model validation

Authors:
Robert H. Bell, Jr.; Lizy K. John
Affiliations:
IBM Systems and Technology Division, Austin, Texas; The University of Texas at Austin
Venue:
Proceedings of the 19th annual international conference on Supercomputing
Year:
2005



Abstract

Performance simulation tools must be validated throughout the design process, as functional models and early hardware are developed, so that designers can be confident in the performance of their designs as they implement changes. The current state of the art is to use simple hand-coded bandwidth and latency testcases to assess early performance and to calibrate performance models. Applications and benchmark suites such as SPEC CPU are difficult to set up or take too long to execute on functional models. Short trace snippets from applications can be executed on performance and functional simulators, but only with difficulty on hardware, and there is no guarantee that hand-coded tests and short snippets cover the performance of the original applications.

We present a new automatic testcase synthesis methodology to address these concerns. By basing testcase synthesis on the workload characteristics of an application, we create source code that largely represents the performance of the application, but which executes in a fraction of the runtime. We synthesize representative versions of the SPEC2000 benchmarks, compile and execute them, and obtain an average IPC within 2.4% of the average IPC of the original benchmarks, with similar average workload characteristics. In addition, the changes in IPC due to design changes are found to be proportional to the changes in IPC for the original applications. The synthetic testcases execute more than three orders of magnitude faster than the original applications, typically in less than 300K instructions, making performance model validation feasible.
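The IPC comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' tooling. The function names, the benchmark labels `bench_a` and `bench_b`, and all numeric values are hypothetical and chosen only to show the arithmetic. The sketch also assumes that "average IPC within 2.4%" means the relative difference between the two suite-average IPCs; the abstract does not say how the average was formed.

```python
from statistics import mean


def ipc(instructions: int, cycles: int) -> float:
    """Instructions per cycle for a single run."""
    return instructions / cycles


def average_ipc_error(original: dict, synthetic: dict) -> float:
    """Relative difference between the suite-average IPCs.

    Assumption: the error is taken between the two averages,
    not as the mean of the per-benchmark errors.
    """
    avg_orig = mean(original[name] for name in original)
    avg_syn = mean(synthetic[name] for name in original)
    return abs(avg_syn - avg_orig) / avg_orig


def ipc_change(base: float, modified: float) -> float:
    """Fractional change in IPC caused by a design change."""
    return (modified - base) / base


# Hypothetical IPC values, for illustration only (not results from the paper).
original_ipc = {"bench_a": 1.0, "bench_b": 2.0}
synthetic_ipc = {"bench_a": 1.1, "bench_b": 1.96}

error = average_ipc_error(original_ipc, synthetic_ipc)
print(f"average IPC error: {error:.1%}")
```

To check the proportionality claim in the same spirit, `ipc_change` could be applied to the original and the synthetic run of each benchmark before and after a design change, and the two resulting fractions compared.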