Choosing representative slices of program execution for microarchitecture simulations: a preliminary application to the data stream

Authors:
Thierry Lafage; André Seznec
Affiliations:
IRISA, Rennes Cedex, France; IRISA, Rennes Cedex, France
Venue:
Workload characterization of emerging computer applications
Year:
2001

Abstract

Microarchitecture simulations aim to provide results that are representative of the behavior of a processor running an application. Because of CPU time constraints, often only a few execution slices of a large application can be simulated. This chapter proposes a technique for choosing a few program execution slices that are representative of the entire execution. We first characterize the behavior of each consecutive execution slice, then use a statistical classification method to discriminate among the slices and select the representative ones. We detail this approach and apply it to the data stream. Using data cache simulations on the SPEC95 programs, we show that slices representing 1.52% of the overall program activity (averaged over all but one of the SPEC95 programs) are as representative as trace sampling with a 10% sampling ratio.
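
The pipeline outlined in the abstract (cut the execution into consecutive slices, characterize each slice, classify the slices, keep one representative per class) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the chapter: the two-number slice signature, the use of plain k-means as a stand-in for the unspecified statistical classification method, the 32-byte line size, and all function names.

```python
import random

def signature(addresses, line_bytes=32):
    """Hypothetical slice signature (an assumption, not the chapter's metric):
    (distinct cache lines per access, fraction of accesses that hit the
    same line as the previous access)."""
    lines = [a // line_bytes for a in addresses]
    distinct = len(set(lines)) / len(lines)
    same_line = sum(1 for p, c in zip(lines, lines[1:]) if p == c) / len(lines)
    return (distinct, same_line)

def dist2(u, v):
    """Squared Euclidean distance between two signatures."""
    return sum((x - y) ** 2 for x, y in zip(u, v))

def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda c: dist2(point, centroids[c]))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means, used here only as a stand-in classifier."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return [nearest(p, centroids) for p in points], centroids

def pick_representatives(trace, slice_len, k):
    """Return (slice_index, weight) pairs: one slice per non-empty cluster,
    weighted by the share of slices that fall in that cluster."""
    slices = [trace[i:i + slice_len]
              for i in range(0, len(trace) - slice_len + 1, slice_len)]
    sigs = [signature(s) for s in slices]
    labels, centroids = kmeans(sigs, k)
    reps = []
    for c in range(k):
        members = [i for i, l in enumerate(labels) if l == c]
        if members:
            best = min(members, key=lambda i: dist2(sigs[i], centroids[c]))
            reps.append((best, len(members) / len(slices)))
    return reps

# Synthetic two-phase data address trace: a streaming phase that touches a
# new line on every access, followed by a tight loop over a single line.
trace = [i * 64 for i in range(300)] + [(i % 4) * 4 for i in range(700)]
reps = pick_representatives(trace, slice_len=100, k=2)
print(reps)
```

Under these assumptions, only the selected slices would be passed to the detailed simulator, and a whole-program figure such as a miss ratio could be estimated by weighting each representative's result by its cluster share. Whether that weighting matches the chapter's own procedure is not stated in the abstract.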