Exploiting temporal decoupling to accelerate trace-driven NoC emulation
CODES+ISSS '11 Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
We present an efficient emulation-based technique to accelerate architecture exploration of networks-on-chip (NoCs). The large NoC design space and the growing complexity of NoCs, which leads to low simulation speeds on host machines, have motivated the use of hardware accelerators to speed up simulation. For example, simulating applications with real-life problem sizes can take weeks on a host machine. FPGA acceleration is a promising strategy that can speed up NoC simulations by several orders of magnitude. However, NoC exploration requires simulating a few billion network transactions of the application, which can still take tens of minutes even on an FPGA-based emulator. As architectures and applications grow more complex, reducing emulation time becomes a key concern. We propose a technique, FastFwd, that minimizes emulation time by efficiently identifying and eliminating redundant cycles during trace-based NoC simulation. We also study the implications of the additional FPGA hardware required to implement the technique. A naïve implementation could scale poorly and increase the required DRAM bandwidth, both of which ultimately reduce emulation speed. We propose a hierarchical controller architecture to resolve the scalability issue and a compressed trace representation to mitigate the increased DRAM bandwidth requirement. Our experiments with several benchmarks show that FPGA emulation with our technique reduces the average emulation time by a factor of 2 compared with conventional emulation.
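The general idea of eliminating redundant cycles in a trace-driven run can be illustrated with a small software sketch. This is not the FastFwd hardware design, whose details the abstract does not give. It is a toy model under stated assumptions: each trace entry is a hypothetical `(injection_cycle, latency)` pair, the per-packet latency is fixed rather than determined by contention, and a cycle counts as redundant when no packet is in flight and the next injection lies in the future. The function name `run` and the trace layout are illustrative only.

```python
from collections import deque
from typing import List, Tuple

# Assumed trace format: (injection cycle, fixed in-network latency in cycles).
Trace = List[Tuple[int, int]]


def run(trace: Trace, fast_forward: bool) -> Tuple[int, int]:
    """Replay a trace; return (final simulated cycle, cycles actually stepped)."""
    pending = deque(sorted(trace))   # packets not yet injected, by injection time
    in_flight: List[int] = []        # completion cycles of packets in the network
    cycle = 0
    stepped = 0
    while pending or in_flight:
        # With an empty network, nothing can change until the next injection,
        # so the clock can jump straight to it instead of ticking idle cycles.
        if fast_forward and not in_flight and pending[0][0] > cycle:
            cycle = pending[0][0]
        # Inject every packet whose injection time has been reached.
        while pending and pending[0][0] <= cycle:
            _, latency = pending.popleft()
            in_flight.append(cycle + latency)
        # Advance the model by one cycle and retire delivered packets.
        cycle += 1
        stepped += 1
        in_flight = [done for done in in_flight if done > cycle]
    return cycle, stepped


# Two bursts of traffic separated by a long idle gap.
trace = [(0, 3), (1, 2), (1000, 4), (1002, 1)]
baseline = run(trace, fast_forward=False)
skipped = run(trace, fast_forward=True)
print(baseline, skipped)
```

In this toy model both runs finish at the same simulated cycle, while the fast-forwarded run steps only the cycles in which a packet is in the network. A real emulator would also have to decide when the network is provably idle across many routers, which is where the scalability and bandwidth concerns raised in the abstract come in.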