Continuous improvements in silicon technology enable the integration of an ever-increasing number of cores on a single chip. Following this trend, microprocessor architectures composed of thousands of cores (i.e., kilo-core architectures) are expected in the near future. To cope with the increasing demand for high-performance systems, many-core designs rely on integrated networks-on-chip to deliver the required bandwidth and latency for inter-core communication. In this context, simulation tools are crucial for designing architectures at such a scale of integration. The efficient simulation of the interconnection network together with the overall architecture (cores, cache memories, accelerators, etc.) remains an open issue. This paper proposes a framework based on the COTSon simulator that is capable of scaling towards heterogeneous kilo-core architectures. Compared with current state-of-the-art architectural simulators, our framework provides not only a full-system architectural simulator but also a fully integrated, accurate network-on-chip simulator. The framework offers a well-balanced trade-off between simulation speed and accuracy, and supports power consumption estimation. Experimental results demonstrate the ability of our framework to correctly simulate a large many-core machine and its interconnection network under different traffic patterns.