Optimal pipelining in supercomputers
ISCA '86 Proceedings of the 13th annual international symposium on Computer architecture
Journal of Parallel and Distributed Computing
Theoretical modeling of superscalar processor performance
MICRO 27 Proceedings of the 27th annual international symposium on Microarchitecture
Optimizing for reduced code space using genetic algorithms
Proceedings of the ACM SIGPLAN 1999 workshop on Languages, compilers, and tools for embedded systems
HLS: combining statistical and symbolic simulation to guide microprocessor designs
Proceedings of the 27th annual international symposium on Computer architecture
Inherently Lower-Power High-Performance Superscalar Architectures
IEEE Transactions on Computers
The optimum pipeline depth for a microprocessor
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
The optimal logic depth per pipeline stage is 6 to 8 FO4 inverter delays
ISCA '02 Proceedings of the 29th annual international symposium on Computer architecture
Automatically characterizing large scale program behavior
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Modeling Superscalar Processors via Statistical Simulation
Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques
Representative Traces for Processor Models with Infinite Cache
HPCA '96 Proceedings of the 2nd IEEE Symposium on High-Performance Computer Architecture
SMARTS: accelerating microarchitecture simulation via rigorous statistical sampling
Proceedings of the 30th annual international symposium on Computer architecture
Computer Architecture: A Quantitative Approach
Computer Architecture: A Quantitative Approach
Single-ISA Heterogeneous Multi-Core Architectures for Multithreaded Workload Performance
Proceedings of the 31st annual international symposium on Computer architecture
Balancing hardware intensity in microprocessor pipelines
IBM Journal of Research and Development
IBM Journal of Research and Development
Fast and efficient searches for effective optimization-phase sequences
ACM Transactions on Architecture and Code Optimization (TACO)
Improving Computer Architecture Simulation Methodology by Adding Statistical Rigor
IEEE Transactions on Computers
Efficient design space exploration of high performance embedded out-of-order processors
Proceedings of the conference on Design, automation and test in Europe: Proceedings
POWER5 System microarchitecture
IBM Journal of Research and Development - POWER5 and packaging
Method-specific dynamic compilation using logistic regression
Proceedings of the 21st annual ACM SIGPLAN conference on Object-oriented programming systems, languages, and applications
Accurate and efficient regression modeling for microarchitectural performance and power prediction
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
Efficiently exploring architectural design spaces via predictive modeling
Proceedings of the 12th international conference on Architectural support for programming languages and operating systems
A Predictive Performance Model for Superscalar Processors
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Regression Modeling Strategies
Regression Modeling Strategies
Methods of inference and learning for performance modeling of parallel applications
Proceedings of the 12th ACM SIGPLAN symposium on Principles and practice of parallel programming
Automated design of application specific superscalar processors: an analytical approach
Proceedings of the 34th annual international symposium on Computer architecture
Measuring Program Similarity: Experiments with SPEC CPU Benchmark Suites
ISPASS '05 Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2005
Microarchitectural Design Space Exploration Using an Architecture-Centric Approach
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Efficiency trends and limits from comprehensive microarchitectural adaptivity
Proceedings of the 13th international conference on Architectural support for programming languages and operating systems
CPR: Composable performance regression for scalable multiprocessor models
Proceedings of the 41st annual IEEE/ACM International Symposium on Microarchitecture
A mechanistic performance model for superscalar out-of-order processors
ACM Transactions on Computer Systems (TOCS)
Design and test strategies for microarchitectural post-fabrication tuning
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
Empirical performance models for 3T1D memories
ICCD'09 Proceedings of the 2009 IEEE international conference on Computer design
An integrated framework for joint design space exploration of microarchitecture and circuits
Proceedings of the Conference on Design, Automation and Test in Europe
Integrated analysis of power and performance for pipelined microprocessors
IEEE Transactions on Computers
Hi-index | 0.00 |
We propose and apply a new simulation paradigm for microarchitectural design evaluation and optimization. This paradigm enables more comprehensive design studies by combining spatial sampling and statistical inference. Specifically, this paradigm (i) defines a large, comprehensive design space, (ii) samples points from the space for simulation, and (iii) constructs regression models based on sparse simulations. This approach greatly improves the computational efficiency of microarchitectural simulation and enables new capabilities in design space exploration. We illustrate new capabilities in three case studies for a large design space of approximately 260,000 points: (i) Pareto frontier, (ii) pipeline depth, and (iii) multiprocessor heterogeneity analyses. In particular, regression models are exhaustively evaluated to identify Pareto optimal designs that maximize performance for given power budgets. These models enable pipeline depth studies in which all parameters vary simultaneously with depth, thereby more effectively revealing interactions with nondepth parameters. Heterogeneity analysis combines regression-based optimization with clustering heuristics to identify efficient design compromises between similar optimal architectures. These compromises are potential core designs in a heterogeneous multicore architecture. Increasing heterogeneity can improve bips3/w efficiency by as much as 2.4×, a theoretical upper bound on heterogeneity benefits that neglects contention between shared resources as well as design complexity. Collectively these studies demonstrate regression models' ability to expose trends and identify optima in diverse design regions, motivating the application of such models in statistical inference for more effective use of modern simulator infrastructure.