Picking Statistically Valid and Early Simulation Points

Authors:
Erez Perelman;Greg Hamerly;Brad Calder
Venue:
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques
Year:
2003

Citing 0
Cited 66

Formal online methods for voltage/frequency control in multiple clock domain microprocessors

ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Adding Limited Reconfigurability to Superscalar Processors

Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques
Automatic Synthesis of High-Speed Processor Simulators

Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Pinpointing Representative Portions of Large Intel® Itanium® Programs with Dynamic Instrumentation

Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
The Fuzzy Correlation between Code and Performance Predictability

Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
How to use SimPoint to pick simulation points

ACM SIGMETRICS Performance Evaluation Review - Special issue on tools for computer architecture research
Accelerated warmup for sampled microarchitecture simulation

ACM Transactions on Architecture and Code Optimization (TACO)
Fast data-locality profiling of native execution

SIGMETRICS '05 Proceedings of the 2005 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Hybrid simulation for embedded software energy estimation

Proceedings of the 42nd annual Design Automation Conference
A Simple Divide-and-Conquer Approach for Neural-Class Branch Prediction

Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques
Simulating Commercial Java Throughput Workloads: A Case Study

ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Implementing Caches in a 3D Technology for High Performance Processors

ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Balancing Resource Utilization to Mitigate Power Density in Processor Pipelines

Proceedings of the 38th annual IEEE/ACM International Symposium on Microarchitecture
Simulation of Computer Architectures: Simulators, Benchmarks, Methodologies, and Recommendations

IEEE Transactions on Computers
Revised Stride Data Value Predictor Design

HPCASIA '05 Proceedings of the Eighth International Conference on High-Performance Computing in Asia-Pacific Region
Selecting Software Phase Markers with Code Structure Analysis

Proceedings of the International Symposium on Code Generation and Optimization
Vulnerability analysis of L2 cache elements to single event upsets

Proceedings of the conference on Design, automation and test in Europe: Proceedings
SMA: a self-monitored adaptive cache warm-up scheme for microprocessor simulation

International Journal of Parallel Programming
A multiprocessing approach to accelerate retargetable and portable dynamic-compiled instruction-set simulation

CODES+ISSS '06 Proceedings of the 4th international conference on Hardware/software codesign and system synthesis
Accurate memory data flow modeling in statistical simulation

Proceedings of the 20th annual international conference on Supercomputing
Reducing Data Cache Susceptibility to Soft Errors

IEEE Transactions on Dependable and Secure Computing
A Sampling Method Focusing on Practicality

IEEE Micro
Fire-and-Forget: Load/Store Scheduling with No Store Queue at All

Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Adaptive Caches: Effective Shaping of Cache Behavior to Workloads

Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
NSL-BLRL: Efficient Cache Warmup for Sampled Processor Simulation

ANSS '06 Proceedings of the 39th annual Symposium on Simulation
Automated design of application specific superscalar processors: an analytical approach

Proceedings of the 34th annual international symposium on Computer architecture
Applying Statistical Sampling for Fast and Efficient Simulation of Commercial Workloads

IEEE Transactions on Computers
Speed versus Accuracy Trade-Offs in Microarchitectural Simulations

IEEE Transactions on Computers
Memory Data Flow Modeling in Statistical Simulation for the Efficient Exploration of Microprocessor Design Spaces

IEEE Transactions on Computers
Optimal Power/Performance Pipeline Depth for SMT in Scaled Technologies

IEEE Transactions on Computers
Quantifying software vulnerability

Proceedings of the 2008 workshop on Radiation effects and fault tolerance in nanometer technologies
Analysing and improving clustering based sampling for microprocessor simulation

International Journal of High Performance Computing and Networking
MLP-Aware Runahead Threads in a Simultaneous Multithreading Processor

HiPEAC '09 Proceedings of the 4th International Conference on High Performance Embedded Architectures and Compilers
Generation, Validation and Analysis of SPEC CPU2006 Simulation Points Based on Branch, Memory and TLB Characteristics

Proceedings of the 2009 SPEC Benchmark Workshop on Computer Performance Evaluation and Benchmarking
Memory-level parallelism aware fetch policies for simultaneous multithreading processors

ACM Transactions on Architecture and Code Optimization (TACO)
A swarm-inspired resource distribution for SMT processors

Proceedings of the 3rd International Conference on Bio-Inspired Models of Network, Information and Computing Systems
Shapeshifter: Dynamically changing pipeline width and speed to address process variations

Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
Branch Predictor Warmup for Sampled Simulation through Branch History Matching

Transactions on High-Performance Embedded Architectures and Compilers II
Design and optimization of the store vectors memory dependence predictor

ACM Transactions on Architecture and Code Optimization (TACO)
Dynamic thermal management via architectural adaptation

Proceedings of the 46th Annual Design Automation Conference
A hybrid local-global approach for multi-core thermal management

Proceedings of the 2009 International Conference on Computer-Aided Design
Accurately evaluating application performance in simulated hybrid multi-tasking systems

Proceedings of the 18th annual ACM/SIGDA international symposium on Field programmable gate arrays
Branch history matching: branch predictor warmup for sampled simulation

HiPEAC'07 Proceedings of the 2nd international conference on High performance embedded architectures and compilers
Using dynamic binary instrumentation to generate multi-platform SimPoints: methodology and accuracy

HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
Phase complexity surfaces: characterizing time-varying program behavior

HiPEAC'08 Proceedings of the 3rd international conference on High performance embedded architectures and compilers
IVEC: off-chip memory integrity protection for both security and reliability

Proceedings of the 37th annual international symposium on Computer architecture
Using hardware vulnerability factors to enhance AVF analysis

Proceedings of the 37th annual international symposium on Computer architecture
Trifecta: a nonspeculative scheme to exploit common, data-dependent subcritical paths

IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Dynamically managed multithreaded reconfigurable architectures for chip multiprocessors

Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Statistical sampling of microarchitecture simulation

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
Detecting phases in parallel applications on shared memory architectures

IPDPS'06 Proceedings of the 20th international conference on Parallel and distributed processing
ReMAP: A Reconfigurable Heterogeneous Multicore Architecture

MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
The shape of the processor design space and its implications for early stage explorations

ACMOS'05 Proceedings of the 7th WSEAS international conference on Automatic control, modeling and simulation
Modulo path history for the reduction of pipeline overheads in path-based neural branch predictors

International Journal of Parallel Programming
Enhancing network processor simulation speed with statistical input sampling

HiPEAC'05 Proceedings of the First international conference on High Performance Embedded Architectures and Compilers
Power reduction of superscalar processor functional units by resizing adder-width

PATMOS'05 Proceedings of the 15th international conference on Integrated Circuit and System Design: power and Timing Modeling, Optimization and Simulation
Making power-efficient data value predictions

ACSAC'05 Proceedings of the 10th Asia-Pacific conference on Advances in Computer Systems Architecture
A technique to reduce static and dynamic power of functional units in high-performance processors

PATMOS'06 Proceedings of the 16th international conference on Integrated Circuit and System Design: power and Timing Modeling, Optimization and Simulation
Characterizing time-varying program behavior using phase complexity surfaces

Transactions on High-Performance Embedded Architectures and Compilers IV
Improving dynamic prediction accuracy through multi-level phase analysis

Proceedings of the 13th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, Tools and Theory for Embedded Systems
Extracting the optimal sampling frequency of applications using spectral analysis

Concurrency and Computation: Practice & Experience
eDoctor: automatically diagnosing abnormal battery drain issues on smartphones

nsdi'13 Proceedings of the 10th USENIX conference on Networked Systems Design and Implementation
Scheduling optimization in multicore multithreaded microprocessors through dynamic modeling

Proceedings of the ACM International Conference on Computing Frontiers
Multi-level phase analysis for sampling simulation

Proceedings of the Conference on Design, Automation and Test in Europe
Hardware/software approaches for reducing the process variation impact on instruction fetches

ACM Transactions on Design Automation of Electronic Systems (TODAES) - Special Section on Networks on Chip: Architecture, Tools, and Methodologies
S/DC: a storage and energy efficient data prefetcher

DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe

Abstract

Modern architecture research relies heavily on detailed pipeline simulation. Simulating the full execution of an industry-standard benchmark can take weeks to months. To address this issue, we recently proposed using Simulation Points (found by examining only basic block execution frequency profiles) to increase the efficiency and accuracy of simulation. Simulation points are a small set of execution samples that, when combined, represent the complete execution of the program.

In this paper we present a statistically driven algorithm for forming the clusters from which simulation points are chosen, and we examine algorithms for picking simulation points earlier in a program's execution in order to significantly reduce fast-forwarding time during simulation. In addition, we show that simulation points can be used independently of the underlying architecture. The points are generated once for a program/input pair by examining only the code executed. We show that the points accurately track hardware metrics (e.g., performance and cache miss rates) across different architecture configurations, so a designer can use the same points to make accurate trade-off decisions between configurations.
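The general idea in the abstract (cluster per-interval basic block frequency profiles, pick one representative interval per cluster, and weight it by cluster size) can be illustrated with a minimal sketch. This is not the authors' algorithm: the fixed cluster count `k`, the farthest-point initialization, and the `early_slack` tolerance used to prefer earlier intervals are all assumptions made for illustration, and every function name below is hypothetical. In particular, the statistical selection of the number of clusters that the abstract refers to is not shown.

```python
import math

def normalize(bbv):
    """Scale a basic block frequency vector so its entries sum to 1."""
    total = sum(bbv)
    return [x / total for x in bbv]

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))

def kmeans(points, k, iters=20):
    """Plain k-means with deterministic farthest-point initialization
    (an assumption of this sketch, chosen for reproducibility)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    labels = [nearest(p, centroids) for p in points]
    return labels, centroids

def pick_points(points, labels, centroids, early_slack=0.0):
    """For each non-empty cluster, return (interval_index, weight).

    The chosen interval is the earliest one whose distance to the
    centroid is within `early_slack` of the best distance in that
    cluster. With early_slack=0.0 this is simply the closest interval.
    The slack rule is a hypothetical stand-in for "early" selection.
    """
    n = len(points)
    chosen = []
    for c in range(len(centroids)):
        members = [i for i, l in enumerate(labels) if l == c]
        if not members:
            continue
        best = min(dist(points[i], centroids[c]) for i in members)
        early = next(i for i in members
                     if dist(points[i], centroids[c]) <= best + early_slack)
        chosen.append((early, len(members) / n))
    return chosen

def simulation_points(bbvs, k, early_slack=0.0):
    """Cluster raw per-interval BBVs and pick weighted representatives."""
    points = [normalize(v) for v in bbvs]
    labels, centroids = kmeans(points, k)
    return pick_points(points, labels, centroids, early_slack)

def estimate(metric_per_interval, sim_points):
    """Weighted estimate of a whole-program metric from the chosen intervals."""
    return sum(w * metric_per_interval[i] for i, w in sim_points)
```

In use, `simulation_points(bbvs, k)` would be run once per program/input pair, and `estimate` would then be applied to the metric measured at the chosen intervals under each architecture configuration. A larger `early_slack` trades some representativeness for less fast-forwarding, since the chosen intervals can lie earlier in the execution.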