Uniprocessor simulators track resource utilization cycle by cycle to estimate performance. Multiprocessor simulators, however, must account for synchronization events that increase the cost of every cycle simulated and shared resource contention that increases the total number of cycles simulated. These effects cause multiprocessor simulation times to scale superlinearly with the number of cores. Composable performance regression (CPR) addresses these intractable multiprocessor simulation times by estimating multiprocessor performance with a combination of uniprocessor, contention, and penalty models. The uniprocessor model predicts the baseline performance of each core, while the contention models predict interfering accesses from other cores. A penalty model composes the uniprocessor and contention model outputs to produce the final multiprocessor performance estimate. Trained with a production-quality simulator, CPR is accurate, with median errors of 6.63 and 4.83 percent for dual- and quad-core multiprocessors, respectively. Furthermore, composable regression is scalable, requiring 0.33× the simulations required by prior regression strategies.
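The composition described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the linear model form, the feature names, and every coefficient below are hypothetical placeholders standing in for regressions that would be trained on simulator data. The sketch only shows how a per-core baseline prediction and a prediction of interfering accesses from the other cores might be passed to a penalty model.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class LinearModel:
    """Stand-in for a trained regression: y = intercept + sum(w_i * x_i)."""
    intercept: float
    weights: Tuple[float, ...]

    def predict(self, features: Sequence[float]) -> float:
        if len(features) != len(self.weights):
            raise ValueError("feature count does not match weight count")
        return self.intercept + sum(w * x for w, x in zip(self.weights, features))


# Hypothetical per-core features: (cache misses per kilo-instruction,
# branch mispredictions per kilo-instruction). All coefficients are made up.
UNIPROCESSOR = LinearModel(0.6, (0.03, 0.05))  # predicts baseline CPI of a core
CONTENTION = LinearModel(0.0, (0.8, 0.0))      # predicts shared accesses a core issues
PENALTY = LinearModel(0.0, (1.0, 0.01))        # inputs: (baseline CPI, interfering accesses)


def predict_multicore_cpi(cores: Sequence[Sequence[float]]) -> list:
    """Compose uniprocessor and contention predictions through the penalty model."""
    baselines = [UNIPROCESSOR.predict(core) for core in cores]
    accesses = [CONTENTION.predict(core) for core in cores]
    total_accesses = sum(accesses)
    # For each core, the interfering accesses are those issued by all other cores.
    return [
        PENALTY.predict((baseline, total_accesses - own))
        for baseline, own in zip(baselines, accesses)
    ]


if __name__ == "__main__":
    dual_core = [(10.0, 4.0), (20.0, 2.0)]
    for index, cpi in enumerate(predict_multicore_cpi(dual_core)):
        print(f"core {index}: predicted CPI {cpi:.2f}")
```

In this arrangement the uniprocessor and contention models need only single-core training data, and only the penalty model has to be fit against multiprocessor simulations. That is one plausible reading of why the composed approach requires fewer simulations.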