Distilling the essence of proprietary workloads into miniature benchmarks

Authors:
Ajay Joshi;Lieven Eeckhout;Robert H. Bell, Jr.;Lizy K. John
Affiliations:
University of Texas at Austin, Austin, Texas;Ghent University, Belgium;IBM, Austin, Texas;University of Texas at Austin, Austin, Texas
Venue:
ACM Transactions on Architecture and Code Optimization (TACO)
Year:
2008

Abstract

Benchmarks set standards for innovation in computer architecture research and industry product development. Consequently, it is of paramount importance that these workloads are representative of real-world applications. However, composing such representative workloads poses practical challenges to application analysis teams and benchmark developers: (1) real-world workloads are intellectual property, and vendors hesitate to share these proprietary applications; and (2) porting and reducing these applications to benchmarks that can be simulated in a tractable amount of time is a nontrivial task. In this paper, we address this problem by proposing a technique that automatically distills key inherent behavioral attributes of a proprietary workload and captures them in a miniature synthetic benchmark clone. The advantage of the benchmark clone is that it hides the functional meaning of the code but exhibits performance characteristics similar to those of the target application. Moreover, the dynamic instruction count of the synthetic benchmark clone is substantially smaller than that of the proprietary application, greatly reducing overall simulation time. For SPEC CPU, the simulation time reduction is over five orders of magnitude compared to entire benchmark execution. Using a set of benchmarks representative of general-purpose, scientific, and embedded applications, we demonstrate that the power and performance characteristics of the synthetic benchmark clone correlate well with those of the original application across a wide range of microarchitecture configurations.
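
The profile-then-synthesize idea in the abstract can be illustrated with a toy sketch. This is not the authors' method: it assumes a single hypothetical attribute (the instruction-class mix), an invented trace, and made-up function names (`profile_mix`, `synthesize_clone`), whereas the paper distills a richer set of behavioral attributes. It only shows how a much shorter sequence can preserve a measured statistic while revealing nothing about what the original code computes.

```python
import random
from collections import Counter


def profile_mix(trace):
    """Return the fraction of each instruction class in a trace."""
    counts = Counter(trace)
    total = len(trace)
    return {op: n / total for op, n in counts.items()}


def synthesize_clone(mix, length, seed=0):
    """Draw a short synthetic sequence whose class mix approximates `mix`.

    Assumption: sampling each instruction independently is enough for this
    toy; it ignores ordering, dependencies, and memory behavior.
    """
    rng = random.Random(seed)
    ops = sorted(mix)
    weights = [mix[op] for op in ops]
    return rng.choices(ops, weights=weights, k=length)


# Hypothetical stand-in for a proprietary workload trace (1,000,000 entries).
original = (
    ["load"] * 300_000
    + ["store"] * 100_000
    + ["alu"] * 450_000
    + ["branch"] * 150_000
)

mix = profile_mix(original)
clone = synthesize_clone(mix, length=10_000)
clone_mix = profile_mix(clone)

# Largest per-class deviation between the original and the clone.
max_error = max(abs(mix[op] - clone_mix.get(op, 0.0)) for op in mix)

# Length reduction factor of the clone relative to the original trace.
reduction = len(original) / len(clone)
```

Here the clone is 100 times shorter than the toy trace. The reduction reported in the abstract for SPEC CPU is far larger and comes from the authors' actual technique, not from this sampling scheme.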