Formal online methods for voltage/frequency control in multiple clock domain microprocessors
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Adding Limited Reconfigurability to Superscalar Processors
Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Automatic Synthesis of High-Speed Processor Simulators
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Pinpointing Representative Portions of Large Intel® Itanium® Programs with Dynamic Instrumentation
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
The Fuzzy Correlation between Code and Performance Predictability
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
How to use SimPoint to pick simulation points
ACM SIGMETRICS Performance Evaluation Review - Special issue on tools for computer architecture research
Accelerated warmup for sampled microarchitecture simulation
ACM Transactions on Architecture and Code Optimization (TACO)
Fast data-locality profiling of native execution
SIGMETRICS '05 Proceedings of the 2005 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Hybrid simulation for embedded software energy estimation
Proceedings of the 42nd annual Design Automation Conference
A Simple Divide-and-Conquer Approach for Neural-Class Branch Prediction
Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Simulating Commercial Java Throughput Workloads: A Case Study
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Implementing Caches in a 3D Technology for High Performance Processors
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Balancing Resource Utilization to Mitigate Power Density in Processor Pipelines
Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Simulation of Computer Architectures: Simulators, Benchmarks, Methodologies, and Recommendations
IEEE Transactions on Computers
Revised Stride Data Value Predictor Design
HPCASIA '05 Proceedings of the Eighth International Conference on High-Performance Computing in Asia-Pacific Region
Selecting Software Phase Markers with Code Structure Analysis
Proceedings of the International Symposium on Code Generation and Optimization
Vulnerability analysis of L2 cache elements to single event upsets
Proceedings of the conference on Design, automation and test in Europe: Proceedings
SMA: a self-monitored adaptive cache warm-up scheme for microprocessor simulation
International Journal of Parallel Programming
CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
Accurate memory data flow modeling in statistical simulation
Proceedings of the 20th annual international conference on Supercomputing
Reducing Data Cache Susceptibility to Soft Errors
IEEE Transactions on Dependable and Secure Computing
A Sampling Method Focusing on Practicality
IEEE Micro
Fire-and-Forget: Load/Store Scheduling with No Store Queue at All
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Adaptive Caches: Effective Shaping of Cache Behavior to Workloads
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
NSL-BLRL: Efficient CacheWarmup for Sampled Processor Simulation
ANSS '06 Proceedings of the 39th annual Symposium on Simulation
Automated design of application specific superscalar processors: an analytical approach
Proceedings of the 34th annual international symposium on Computer architecture
Applying Statistical Sampling for Fast and Efficient Simulation of Commercial Workloads
IEEE Transactions on Computers
Speed versus Accuracy Trade-Offs in Microarchitectural Simulations
IEEE Transactions on Computers
IEEE Transactions on Computers
Optimal Power/Performance Pipeline Depth for SMT in Scaled Technologies
IEEE Transactions on Computers
Quantifying software vulnerability
Proceedings of the 2008 workshop on Radiation effects and fault tolerance in nanometer technologies
Analysing and improving clustering based sampling for microprocessor simulation
International Journal of High Performance Computing and Networking
MLP-Aware Runahead Threads in a Simultaneous Multithreading Processor
HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
Proceedings of the 2009 SPEC Benchmark Workshop on Computer Performance Evaluation and Benchmarking
Memory-level parallelism aware fetch policies for simultaneous multithreading processors
ACM Transactions on Architecture and Code Optimization (TACO)
A swarm-inspired resource distribution for SMT processors
Proceedings of the 3rd International Conference on Bio-Inspired Models of Network, Information and Computing Sytems
Shapeshifter: Dynamically changing pipeline width and speed to address process variations
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Branch Predictor Warmup for Sampled Simulation through Branch History Matching
Transactions on High-Performance Embedded Architectures and Compilers II
Design and optimization of the store vectors memory dependence predictor
ACM Transactions on Architecture and Code Optimization (TACO)
Dynamic thermal management via architectural adaptation
Proceedings of the 46th Annual Design Automation Conference
A hybrid local-global approach for multi-core thermal management
Proceedings of the 2009 International Conference on Computer-Aided Design
Accurately evaluating application performance in simulated hybrid multi-tasking systems
Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays
Branch history matching: branch predictor warmup for sampled simulation
HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Using dynamic binary instrumentation to generate multi-platform SimPoints: methodology and accuracy
HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
Phase complexity surfaces: characterizing time-varying program behavior
HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
IVEC: off-chip memory integrity protection for both security and reliability
Proceedings of the 37th annual international symposium on Computer architecture
Using hardware vulnerability factors to enhance AVF analysis
Proceedings of the 37th annual international symposium on Computer architecture
Trifecta: a nonspeculative scheme to exploit common, data-dependent subcritical paths
IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Dynamically managed multithreaded reconfigurable architectures for chip multiprocessors
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Statistical sampling of microarchitecture simulation
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Detecting phases in parallel applications on shared memory architectures
IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
ReMAP: A Reconfigurable Heterogeneous Multicore Architecture
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
The shape of the processor design space and its implications for early stage explorations
ACMOS'05 Proceedings of the 7th WSEAS international conference on Automatic control, modeling and simulation
Modulo path history for the reduction of pipeline overheads in path-based neural branch predictors
International Journal of Parallel Programming
Enhancing network processor simulation speed with statistical input sampling
HiPEAC'05 Proceedings of the First international conference on High Performance Embedded Architectures and Compilers
Power reduction of superscalar processor functional units by resizing adder-width
PATMOS'05 Proceedings of the 15th international conference on Integrated Circuit and System Design: power and Timing Modeling, Optimization and Simulation
Making power-efficient data value predictions
ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture
A technique to reduce static and dynamic power of functional units in high-performance processors
PATMOS'06 Proceedings of the 16th international conference on Integrated Circuit and System Design: power and Timing Modeling, Optimization and Simulation
Characterizing time-varying program behavior using phase complexity surfaces
Transactions on High-Performance Embedded Architectures and Compilers IV
Improving dynamic prediction accuracy through multi-level phase analysis
Proceedings of the 13th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, Tools and Theory for Embedded Systems
Extracting the optimal sampling frequency of applications using spectral analysis
Concurrency and Computation: Practice & Experience
eDoctor: automatically diagnosing abnormal battery drain issues on smartphones
nsdi'13 Proceedings of the 10th USENIX conference on Networked Systems Design and Implementation
Scheduling optimization in multicore multithreaded microprocessors through dynamic modeling
Proceedings of the ACM International Conference on Computing Frontiers
Multi-level phase analysis for sampling simulation
Proceedings of the Conference on Design, Automation and Test in Europe
Hardware/software approaches for reducing the process variation impact on instruction fetches
ACM Transactions on Design Automation of Electronic Systems (TODAES) - Special Section on Networks on Chip: Architecture, Tools, and Methodologies
S/DC: a storage and energy efficient data prefetcher
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
Hi-index | 0.01 |
Modern architecture research relies heavily on detailed pipeline simulation. Simulating the full execution of an industry standard benchmark can take weeks to months to complete. To address this issue we have recently proposed using Simulation Points (found by only examining basic block execution frequency profiles) to increase the efficiency and accuracy of simulation. Simulation points are a small set of execution samples that when combined represent the complete execution of the program.In this paper we present a statistically driven algorithm for forming clusters from which simulation points are chosen, and examine algorithms for picking simulation points earlier in a program's execution - in order to significantly reduce fast-forwarding time during simulation. In addition, we show that simulation points can be used independent of the underlying architecture. The points are generated once for a program/input pair by only examining the code executed. We show the points accurately track hardware metrics (e.g., performance and cache miss rates) between different architecture configurations. They can therefore be used across different architecture configurations to allow a designer to make accurate trade-off decisions between different configurations.