The early 1990s saw several announcements of commercial shared-memory systems using processors that aggressively exploited instruction-level parallelism (ILP), including the MIPS R10000, Hewlett-Packard PA8000, and Intel Pentium Pro. These processors could potentially reduce memory read stalls by overlapping read latency with other operations, possibly changing the nature of performance bottlenecks in the system.

The authors' experience with Rsim demonstrates that modeling ILP features is important even in shared-memory multiprocessor systems. In particular, current simple processor-based approximations cannot model significant performance effects for applications exhibiting parallel read misses. Further, recent shared-memory designs such as aggressive implementations of sequential consistency use the aggressive ILP-enhancing features of modern processors that simple processor-based simulators do not model.

As microprocessor systems become more complex, the availability of shared infrastructure source code is likely to become increasingly crucial. The authors plan to release a new Rsim version shortly that will include instruction caches, TLBs, multimedia extensions, simultaneous multithreading, Rabbit fast simulation mode, and ports to Linux platforms.