Cache performance of operating system and multiprogramming workloads
ACM Transactions on Computer Systems (TOCS)
Accurate Low-Cost Methods for Performance Evaluation of Cache Memory Systems
IEEE Transactions on Computers
A model for estimating trace-sample miss ratios
SIGMETRICS '91 Proceedings of the 1991 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Implementing stack simulation for highly-associative memories
SIGMETRICS '91 Proceedings of the 1991 ACM SIGMETRICS conference on Measurement and modeling of computer systems
Profetching and memory system behavior of the SPEC95 benchmark suite
IBM Journal of Research and Development - Special issue: performance analysis and its impact on design
Computer architecture (2nd ed.): a quantitative approach
Computer architecture (2nd ed.): a quantitative approach
On the use of trace sampling for architectural studies of desktop applications
SIGMETRICS '99 Proceedings of the 1999 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
IEEE Transactions on Computers
ACM Computing Surveys (CSUR)
Clustering Algorithms
The Alpha 21264 Microprocessor
IEEE Micro
Reducing State Loss For Effective Trace Sampling of Superscalar Processors
ICCD '96 Proceedings of the 1996 International Conference on Computer Design, VLSI in Computers and Processors
Representative Traces for Processor Models with Infinite Cache
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
Automatically characterizing large scale program behavior
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Predicting whole-program locality through reuse distance analysis
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
Automatic Synthesis of High-Speed Processor Simulators
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
EMPS: An Environment for Memory Performance Studies
IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Workshop 10 - Volume 11
A multinomial clustering model for fast simulation of computer architecture designs
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Distilling the essence of proprietary workloads into miniature benchmarks
ACM Transactions on Architecture and Code Optimization (TACO)
Analysing and improving clustering based sampling for microprocessor simulation
International Journal of High Performance Computing and Networking
COTSon: infrastructure for full system simulation
ACM SIGOPS Operating Systems Review
Accurately evaluating application performance in simulated hybrid multi-tasking systems
Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays
A Simulation Framework for Rapid Analysis of Reconfigurable Computing Systems
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Automatic estimation of performance requirements for software tasks of mobile devices
Proceedings of the 2nd ACM/SPEC International Conference on Performance engineering
MCEmu: A Framework for Software Development and Performance Analysis of Multicore Systems
ACM Transactions on Design Automation of Electronic Systems (TODAES)
Fine-grained Benchmark Subsetting for System Selection
Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization
Hi-index | 0.01 |
Microarchitecture simulations are aimed at providing results representative of the behavior of a processor running and application. Due to CPU time constraints, only a few execution slices of a large application can often be simulated. The aim of this chapter is to propose a technique to choose a few program execution slices representative of the entire execution. We characterize the behavior of each consecutive slice executed. Then we use a statistical classification method to discriminate the execution slices and select the representative ones. In this chapter, we detail this approach and apply it to the data stream. Using data cache simulations on the SPEC95 programs, we show that slices representing 1.52% ( average upon all the SPEC95 but one ) of the overall program activity are as representative as trace sampling using a 10% sampling ratio.