The YAGS branch prediction scheme
MICRO 31 Proceedings of the 31st annual ACM/IEEE international symposium on Microarchitecture
Automatically characterizing large scale program behavior
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
SPEComp: A New Benchmark Suite for Measuring Parallel Computer Performance
WOMPAT '01 Proceedings of the International Workshop on OpenMP Applications and Tools: OpenMP Shared Memory Parallel Programming
Variability in Architectural Simulations of Multi-Threaded Workloads
HPCA '03 Proceedings of the 9th International Symposium on High-Performance Computer Architecture
The Effects of Mispredicted-Path Execution on Branch Prediction Structures
PACT '96 Proceedings of the 1996 Conference on Parallel Architectures and Compilation Techniques
Redeeming IPC as a Performance Metric for Multithreaded Programs
Proceedings of the 12th International Conference on Parallel Architectures and Compilation Techniques
Managing Wire Delay in Large Chip-Multiprocessor Caches
Proceedings of the 37th annual IEEE/ACM International Symposium on Microarchitecture
Characterizing and Comparing Prevailing Simulation Techniques
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset
ACM SIGARCH Computer Architecture News - Special issue: dasCMP'05
Efficiently Evaluating Speedup Using Sampled Processor Simulation
IEEE Computer Architecture Letters
POWER5 System microarchitecture
IBM Journal of Research and Development - POWER5 and packaging
Computer Organization and Design: The Hardware/Software Interface
Computer Organization and Design: The Hardware/Software Interface
Coherence Ordering for Ring-based Chip Multiprocessors
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Virtual hierarchies to support server consolidation
Proceedings of the 34th annual international symposium on Computer architecture
Characterization of Apache web server with Specweb2005
MEDEA '07 Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture
PAM: a novel performance/power aware meta-scheduler for multi-core systems
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
In-network coherence filtering: snoopy coherence without broadcasts
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture
Utilizing predictors for efficient thermal management in multiprocessor SoCs
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
NPC'07 Proceedings of the 2007 IFIP international conference on Network and parallel computing
On-Chip Network Evaluation Framework
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
The ZCache: Decoupling Ways and Associativity
MICRO '43 Proceedings of the 2010 43rd Annual IEEE/ACM International Symposium on Microarchitecture
Run-time energy management of manycore systems through reconfigurable interconnects
Proceedings of the 21st edition of the great lakes symposium on Great lakes symposium on VLSI
The impact of memory subsystem resource sharing on datacenter applications
Proceedings of the 38th annual international symposium on Computer architecture
Scalable power control for many-core architectures running multi-threaded applications
Proceedings of the 38th annual international symposium on Computer architecture
Capacity metric for chip heterogeneous multiprocessors
CODES+ISSS '11 Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
A systematic methodology to develop resilient cache coherence protocols
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
Towards the ideal on-chip fabric for 1-to-many and many-to-1 communication
Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
SRP: symbiotic resource partitioning of the memory hierarchy in CMPs
HiPEAC'10 Proceedings of the 5th international conference on High Performance Embedded Architectures and Compilers
Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing
Power-aware performance increase via core/uncore reinforcement control for chip-multiprocessors
Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design
Measuring interference between live datacenter applications
SC '12 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Empirical measurement of instruction level parallelism for four generations of ARM CPUs
Proceedings of the 2013 International Workshop on Programming Models and Applications for Multicores and Manycores
Paragon: QoS-aware scheduling for heterogeneous datacenters
Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
CPI2: CPU performance isolation for shared compute clusters
Proceedings of the 8th ACM European Conference on Computer Systems
Reuse-based online models for caches
Proceedings of the ACM SIGMETRICS/international conference on Measurement and modeling of computer systems
ZSim: fast and accurate microarchitectural simulation of thousand-core systems
Proceedings of the 40th Annual International Symposium on Computer Architecture
DATE '12 Proceedings of the Conference on Design, Automation and Test in Europe
SMT-centric power-aware thread placement in chip multiprocessors
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
Jigsaw: scalable software-defined caches
PACT '13 Proceedings of the 22nd international conference on Parallel architectures and compilation techniques
USENIX ATC'13 Proceedings of the 2013 USENIX conference on Annual Technical Conference
Ubik: efficient cache sharing with strict qos for latency-critical workloads
Proceedings of the 19th international conference on Architectural support for programming languages and operating systems
PCantorSim: Accelerating parallel architecture simulation through fractal-based sampling
ACM Transactions on Architecture and Code Optimization (TACO)
QoS-Aware scheduling in heterogeneous datacenters with paragon
ACM Transactions on Computer Systems (TOCS)
Hi-index | 0.00 |
Many architectural simulation studies use instructions per cycle (IPC) to analyze performance. For multithreaded programs running on multiprocessor systems, however, IPC often inaccurately reflects performance and leads to incorrect or misleading conclusions. Work-related metrics, such as time per transaction, are the most accurate and reliable way to estimate multiprocessor workload performance.