Nanophotonics is a promising solution for on-chip interconnection because of its intrinsically low latency and, above all, its low power consumption, both of which are desirable in future chip multiprocessors (CMPs) for rich client devices. In this paper we address the co-design of the parameters of a hybrid on-chip network that combines a traditional 2D mesh with a simple photonic helper ring intended to improve performance and reduce energy consumption. Because the simple optical interconnect considered here cannot sustain all CMP traffic without saturating the available bandwidth, which would degrade both performance and energy, we identify the subset of coherency messages that are most worth accelerating through the low-energy optical path. We also investigate management and arbitration strategies for the physically shared photonic path, since they are crucial for exploiting the optical bandwidth effectively, depending on their overhead and on the parallelism they achieve in message transmission. Our results on multithreaded benchmarks show that a careful selection of the most latency-critical messages to be routed on the photonic path, combined with a Multiple-Writers-Single-Reader access scheme, yields execution-time and energy improvements of up to 19% and 5%, respectively, for the 8-core setup and of up to 16% and 13% for the 16-core configuration. Furthermore, we show that the most aggressive ring access schemes allow the adoption of a four-times-slower electronic NoC, trading the achieved average speedup margin for 70% overall energy savings, which is extremely important in energy-constrained devices.
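The two ideas in the abstract (steering only selected coherency messages onto the photonic ring, and a Multiple-Writers-Single-Reader access scheme) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the message kinds in `LATENCY_CRITICAL`, the per-reader queue limit, and the fallback-to-mesh rule are hypothetical, since the abstract does not specify which messages are selected or how arbitration is implemented.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical set of latency-critical message kinds (an assumption;
# the abstract does not list the selected coherency messages).
LATENCY_CRITICAL = {"read_request", "invalidate_ack"}


@dataclass
class Message:
    kind: str
    src: int
    dst: int


class MwsrRing:
    """Toy MWSR model: each reader (destination) owns one channel,
    and any writer may enqueue on it while it has free slots."""

    def __init__(self, num_nodes, queue_limit):
        self.queues = [deque() for _ in range(num_nodes)]
        self.queue_limit = queue_limit  # assumed per-channel capacity

    def try_send(self, msg):
        channel = self.queues[msg.dst]
        if len(channel) >= self.queue_limit:
            return False  # channel saturated, the writer loses arbitration
        channel.append(msg)
        return True

    def step(self):
        # One message per reader channel per cycle; distinct readers
        # receive in parallel.
        return [q.popleft() for q in self.queues if q]


def route(msg, ring):
    """Send latency-critical messages on the ring when a slot is free;
    everything else (and any overflow) falls back to the electronic mesh."""
    if msg.kind in LATENCY_CRITICAL and ring.try_send(msg):
        return "photonic"
    return "mesh"
```

In this sketch, two writers targeting the same reader contend for one channel, while writers targeting different readers proceed in parallel, which is the source of the transmission parallelism the abstract refers to.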