Benchmarks set standards for innovation in computer architecture research and industry product development. Consequently, it is of paramount importance that these workloads be representative of real-world applications. However, composing such representative workloads poses practical challenges to application analysis teams and benchmark developers: (1) real-world workloads are intellectual property, and vendors hesitate to share these proprietary applications; and (2) porting and reducing these applications to benchmarks that can be simulated in a tractable amount of time is a nontrivial task. In this paper, we address this problem by proposing a technique that automatically distills key inherent behavioral attributes of a proprietary workload and captures them in a miniature synthetic benchmark clone. The advantage of the benchmark clone is that it hides the functional meaning of the code while exhibiting performance characteristics similar to those of the target application. Moreover, the dynamic instruction count of the synthetic benchmark clone is substantially smaller than that of the proprietary application, greatly reducing overall simulation time; for SPEC CPU, the simulation time reduction is over five orders of magnitude compared to entire benchmark execution. Using a set of benchmarks representative of general-purpose, scientific, and embedded applications, we demonstrate that the power and performance characteristics of the synthetic benchmark clone correlate well with those of the original application across a wide range of microarchitecture configurations.
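As a rough illustration of the profile-then-synthesize idea described above, the following minimal sketch distills one attribute from a workload trace and emits a much shorter sequence with the same attribute. It is not the technique proposed in the paper: the abstract does not enumerate which behavioral attributes are captured, so the instruction mix is used here purely as an assumed stand-in, and the instruction class names, trace contents, and function names (`profile_mix`, `synthesize_clone`) are all hypothetical.

```python
from collections import Counter


def profile_mix(trace):
    """Return the fraction of each instruction class in a trace."""
    counts = Counter(trace)
    total = len(trace)
    return {op: n / total for op, n in counts.items()}


def synthesize_clone(mix, length):
    """Build a short sequence of instruction classes whose mix approximates `mix`.

    Largest-remainder rounding makes the per-class counts sum exactly to `length`.
    """
    exact = {op: frac * length for op, frac in mix.items()}
    counts = {op: int(x) for op, x in exact.items()}
    leftover = length - sum(counts.values())
    # Hand the remaining slots to the classes with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda op: exact[op] - counts[op], reverse=True)
    for op in by_remainder[:leftover]:
        counts[op] += 1
    # Ordering is arbitrary in this sketch; only the mix is preserved.
    clone = []
    for op in sorted(counts):
        clone.extend([op] * counts[op])
    return clone


# Hypothetical "proprietary" trace of one million dynamic instructions.
original = (["load"] * 300_000 + ["store"] * 100_000
            + ["alu"] * 500_000 + ["branch"] * 100_000)

mix = profile_mix(original)
clone = synthesize_clone(mix, 1_000)
clone_mix = profile_mix(clone)

# Ratio of dynamic instruction counts, original to clone.
reduction = len(original) / len(clone)
```

The clone carries only the statistical profile, not the original instruction stream, which mirrors the abstract's point that the functional meaning of the code stays hidden. A realistic clone would need to match far more than a single attribute for its performance and power behavior to track the original across microarchitecture configurations.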