Understanding the cycle-level behavior of a processor running an application is crucial to modern computer architecture research. To gain this understanding, detailed cycle-level simulators are typically employed. Unfortunately, this level of detail comes at the cost of speed: simulating the full execution of an industry-standard benchmark on even the fastest simulator can take weeks to months. This fact has not gone unnoticed, and several techniques have been developed to reduce simulation time.