On the Predictability of Program Behavior Using Different Input Data Sets

Authors:
Wei Chung Hsu; Howard Chen; Pen Chung Yew; Dong-Yuan Chen
Venue:
INTERACT '02 Proceedings of the Sixth Annual Workshop on Interaction between Compilers and Computer Architectures
Year:
2002

Abstract

Smaller input data sets, such as the test and train input sets, are commonly used in simulation to estimate the impact of architectural and micro-architectural features on the performance of SPEC benchmarks. They are also used for profile-feedback compiler optimizations. In this paper, we examine the reliability of reduced input sets for performance simulation and profile-feedback optimizations. We study high-level metrics, such as IPC and procedure-level profiles, as well as lower-level measurements, such as the execution paths exercised by various input sets, on the SPEC2000int benchmarks. Our study indicates that the test input sets are not suitable for simulation because their execution profiles are not similar to those of the reference input runs. The train input sets are better than the test input sets at maintaining profiles similar to those of the reference input sets. However, the observed execution paths leading to cache misses differ greatly between the smaller input sets and the reference input sets. For current profile-based optimizations, the differences in profile quality may not have a significant impact on performance, as tested on the Itanium processor with the Intel compiler. However, we believe the impact of profile quality will be greater for more aggressive profile-guided optimizations, such as cache prefetching.
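
As an illustration of what comparing procedure-level profiles across input sets can look like, the sketch below scores how much two profiles overlap. The abstract does not say which similarity measure the authors used, so the histogram-intersection score here is an assumption, and the procedure names and sample counts are hypothetical values chosen only to show the calculation.

```python
from typing import Dict


def normalize(profile: Dict[str, float]) -> Dict[str, float]:
    """Convert raw per-procedure sample counts into fractions of total samples."""
    total = sum(profile.values())
    if total <= 0:
        raise ValueError("profile has no samples")
    return {name: count / total for name, count in profile.items()}


def profile_overlap(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Histogram intersection of two normalized profiles.

    Returns a value in [0, 1]: 1 means both runs spend identical fractions
    of time in every procedure, 0 means they share no procedures at all.
    (Assumed metric for illustration; not taken from the paper.)
    """
    pa, pb = normalize(a), normalize(b)
    procedures = set(pa) | set(pb)
    return sum(min(pa.get(p, 0.0), pb.get(p, 0.0)) for p in procedures)


# Hypothetical sample counts per procedure for one benchmark.
ref_profile = {"parse": 600, "eval": 300, "gc": 100}
train_profile = {"parse": 550, "eval": 350, "gc": 100}
test_profile = {"parse": 100, "eval": 100, "init": 800}

if __name__ == "__main__":
    print("train vs ref:", profile_overlap(train_profile, ref_profile))
    print("test  vs ref:", profile_overlap(test_profile, ref_profile))
```

In this made-up data the train profile tracks the reference profile closely, while the test profile is dominated by a start-up procedure that barely matters in the reference run. That is the kind of divergence the abstract describes for the test input sets.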