Distributed discrete-event simulation
ACM Computing Surveys (CSUR)
Parallel discrete event simulation
Communications of the ACM - Special issue on simulation
ATOM: a system for building customized program analysis tools
PLDI '94 Proceedings of the ACM SIGPLAN 1994 conference on Programming language design and implementation
The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Full-system timing-first simulation
SIGMETRICS '02 Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
Workload characterization of emerging computer applications
Automatically characterizing large scale program behavior
Proceedings of the 10th international conference on Architectural support for programming languages and operating systems
Complete Computer System Simulation: The SimOS Approach
IEEE Parallel & Distributed Technology: Systems & Technology
SMARTS: accelerating microarchitecture simulation via rigorous statistical sampling
Proceedings of the 30th annual international symposium on Computer architecture
Characterizing and Comparing Prevailing Simulation Techniques
HPCA '05 Proceedings of the 11th International Symposium on High-Performance Computer Architecture
Pin: building customized program analysis tools with dynamic instrumentation
Proceedings of the 2005 ACM SIGPLAN conference on Programming language design and implementation
QEMU, a fast and portable dynamic translator
ATEC '05 Proceedings of the annual conference on USENIX Annual Technical Conference
The Strong correlation Between Code Signatures and Performance
ISPASS '05 Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2005
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
Corona: System Implications of Emerging Nanophotonic Technology
ISCA '08 Proceedings of the 35th Annual International Symposium on Computer Architecture
An Adaptive Synchronization Technique for Parallel Simulation of Networked Clusters
ISPASS '08 Proceedings of the ISPASS 2008 - IEEE International Symposium on Performance Analysis of Systems and software
Disaggregated memory for expansion and sharing in blade servers
Proceedings of the 36th annual international symposium on Computer architecture
ACM SIGARCH Computer Architecture News
Instruction-level simulation of a cluster at scale
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Proceedings of the 20th symposium on Great lakes symposium on VLSI
Operating system support for NVM+DRAM hybrid main memory
HotOS'09 Proceedings of the 12th conference on Hot topics in operating systems
Simple but Effective Heterogeneous Main Memory with On-Chip Memory Controller Support
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Journal of Parallel and Distributed Computing
Optimizing the datacenter for data-centric workloads
Proceedings of the international conference on Supercomputing
PADS '10 Proceedings of the 2010 IEEE Workshop on Principles of Advanced and Distributed Simulation
Flow: A Stream Processing System Simulator
PADS '10 Proceedings of the 2010 IEEE Workshop on Principles of Advanced and Distributed Simulation
A latency simulator for many-core systems
Proceedings of the 44th Annual Simulation Symposium
VSim: Simulating multi-server setups at near native hardware speed
ACM Transactions on Architecture and Code Optimization (TACO) - HIPEAC Papers
Full system simulation of many-core heterogeneous SoCs using GPU and QEMU semihosting
Proceedings of the 5th Annual Workshop on General Purpose Processing with Graphics Processing Units
A universal parallel front-end for execution driven microarchitecture simulation
Proceedings of the 2012 Workshop on Rapid Simulation and Performance Evaluation: Methods and Tools
Transformer: a functional-driven cycle-accurate multicore simulator
Proceedings of the 49th Annual Design Automation Conference
Fast architecture evaluation of heterogeneous MPSoCs by host-compiled simulation
Proceedings of the 15th International Workshop on Software and Compilers for Embedded Systems
Trace-driven simulation of memory system scheduling in multithread application
Proceedings of the 2012 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness
Simulating the future kilo-x86-64 core processors and their infrastructure
Proceedings of the 45th Annual Simulation Symposium
A flexible simulation framework for multicore schedulers: work in progress paper
Proceedings of the 2013 ACM SIGSIM conference on Principles of advanced discrete simulation
Reducing DRAM row activations with eager read/write clustering
ACM Transactions on Architecture and Code Optimization (TACO)
PCantorSim: Accelerating parallel architecture simulation through fractal-based sampling
ACM Transactions on Architecture and Code Optimization (TACO)
Hi-index | 0.00 |
Simulation has historically been the primary technique used for evaluating the performance of new proposals in computer architecture. Speed and complexity considerations have traditionally limited its applicability to single-thread processors running application-level code. This is no longer sufficient to model modern multicore systems running the complex workloads of commercial interest today. COTSon is a simulator framework jointly developed by HP Labs and AMD. The goal of COTSon is to provide fast and accurate evaluation of current and future computing systems, covering the full software stack and complete hardware models. It targets cluster-level systems composed of hundreds of commodity multicore nodes and their associated devices connected through a standard communication network. COTSon adopts a functional-directed philosophy, where fast functional emulators and timing models cooperate to improve the simulation accuracy at a speed sufficient to simulate the full stack of applications, middleware and OSs. This paper describes the changes in simulation philosophy we embraced in COTSon to address these new challenges. We base functional emulation on established, fast and validated tools that support commodity OSs and complex multitier applications. Through a robust interface between the functional and timing domain, we can leverage other existing simulators for individual sub-components, such as disks or networks. We abandon the idea of "always-on" cycle-based simulation in favor of statistical sampling approaches that can trade accuracy for speed. COTSon opens up a new dimension in the speed/accuracy space, allowing simulation of a cluster of nodes several orders of magnitude faster with a minimal accuracy loss.