Predicting program behavior using real or estimated profiles
PLDI '91 Proceedings of the ACM SIGPLAN 1991 conference on Programming language design and implementation
Predicting conditional branch directions from previous runs of a program
ASPLOS V Proceedings of the fifth international conference on Architectural support for programming languages and operating systems
Analysis of benchmark characteristics and benchmark performance prediction
ACM Transactions on Computer Systems (TOCS)
The intrinsic bandwidth requirements of ordinary programs
Proceedings of the seventh international conference on Architectural support for programming languages and operating systems
The SimpleScalar tool set, version 2.0
ACM SIGARCH Computer Architecture News
Overcoming the challenges to feedback-directed optimization (Keynote Talk)
DYNAMO '00 Proceedings of the ACM SIGPLAN workshop on Dynamic and adaptive compilation and optimization
Adapting the SPEC 2000 benchmark suite for simulation-based computer architecture research
Workload characterization of emerging computer applications
Workload Characterization: Motivation, Goals and Methodology
WWC '98 Proceedings of the Workload Characterization: Methodology and Case Studies
On the Predictability of Program Behavior Using Different Input Data Sets
INTERACT '02 Proceedings of the Sixth Annual Workshop on Interaction between Compilers and Computer Architectures
Predicting whole-program locality through reuse distance analysis
PLDI '03 Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation
A Statistically Rigorous Approach for Improving Simulation Methodology
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
VHC: Quickly Building an Optimizer for Complex Embedded Architectures
Proceedings of the international symposium on Code generation and optimization: feedback-directed and runtime optimization
The Fuzzy Correlation between Code and Performance Predictability
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Improving Computer Architecture Simulation Methodology by Adding Statistical Rigor
IEEE Transactions on Computers
Simulation of Computer Architectures: Simulators, Benchmarks, Methodologies, and Recommendations
IEEE Transactions on Computers
The exigency of benchmark and compiler drift: designing tomorrow's processors with yesterday's tools
Proceedings of the 20th annual international conference on Supercomputing
Evaluating the correspondence between training and reference workloads in SPEC CPU2006
ACM SIGARCH Computer Architecture News
Dynamic prediction of architectural vulnerability from microarchitectural state
Proceedings of the 34th annual international symposium on Computer architecture
Miss Rate Prediction Across Program Inputs and Cache Configurations
IEEE Transactions on Computers
Speed versus Accuracy Trade-Offs in Microarchitectural Simulations
IEEE Transactions on Computers
Hardware counter driven on-the-fly request signatures
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
Workload Reduction for Multi-input Feedback-Directed Optimization
Proceedings of the 7th annual IEEE/ACM International Symposium on Code Generation and Optimization
Program locality analysis using reuse distance
ACM Transactions on Programming Languages and Systems (TOPLAS)
Fast model-based test case classification for performance analysis of multimedia MPSoC platforms
CODES+ISSS '09 Proceedings of the 7th IEEE/ACM international conference on Hardware/software codesign and system synthesis
Finding representative workloads for computer system design
Finding representative workloads for computer system design
Accurately evaluating application performance in simulated hybrid multi-tasking systems
Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays
Lightweight runtime control flow analysis for adaptive loop caching
Proceedings of the 20th symposium on Great lakes symposium on VLSI
Phase complexity surfaces: characterizing time-varying program behavior
HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
Rapid early-stage microarchitecture design using predictive models
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
SubsetTrio: An evolutionary, geometric, and statistical benchmark subsetting framework
ACM Transactions on Modeling and Computer Simulation (TOMACS)
Automatic estimation of performance requirements for software tasks of mobile devices
Proceedings of the 2nd ACM/SPEC International Conference on Performance engineering
BarrierWatch: characterizing multithreaded workloads across and within program-defined epochs
Proceedings of the 8th ACM International Conference on Computing Frontiers
Reducing TPC-H benchmarking time
PCI'05 Proceedings of the 10th Panhellenic conference on Advances in Informatics
Characterizing time-varying program behavior using phase complexity surfaces
Transactions on High-Performance Embedded Architectures and Compilers IV
Adaptive loop caching using lightweight runtime control flow analysis
ACM Transactions on Embedded Computing Systems (TECS) - Special section on ESTIMedia'12, LCTES'11, rigorous embedded systems design, and multiprocessor system-on-chip for cyber-physical systems
Microarchitectural design space exploration made fast
Microprocessors & Microsystems
Quipu: A Statistical Model for Predicting Hardware Resources
ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Selecting representative benchmark inputs for exploring microprocessor design spaces
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.02 |
Having a representative workload of the target domain of a microprocessor is extremely important throughout its design. The composition of a workload involves two issues: (i) which benchmarks to select and (ii) which input data sets to select per benchmark. Unfortunately, it is impossible to select a huge number of benchmarks and respective input sets due to the large instruction counts per benchmark and due to limitations on the available simulation time. In this paper, we use statistical data analysis techniques such as principal components analysis (PCA) and cluster analysis to efficiently explore the workload space. Within this workload space, different input data sets for a given benchmark can be displayed, a distance can be measured between program-input pairs that gives us an idea about their mutual behavioral differences and representative input data sets can be selected for the given benchmark. This methodology is validated by showing that program-input pairs that are close to each other in this workload space indeed exhibit similar behavior. The final goal is to select a limited set of representative benchmark-input pairs that span the complete workload space. Next to workload composition, there are a number of other possible applications, namely getting insight in the impact of input data sets on program behavior and profile-guided compiler optimizations.