The SPLASH-2 programs: characterization and methodological considerations
ISCA '95 Proceedings of the 22nd annual international symposium on Computer architecture
Flow control and micro-architectural mechanisms for extending the performance of interconnection networks
Low-Latency Virtual-Channel Routers for On-Chip Networks
Proceedings of the 31st annual international symposium on Computer architecture
The design and implementation of a low-latency on-chip network
ASP-DAC '06 Proceedings of the 2006 Asia and South Pacific Design Automation Conference
ViChaR: A Dynamic Virtual Channel Regulator for Network-on-Chip Routers
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture
Flattened butterfly: a cost-efficient topology for high-radix networks
Proceedings of the 34th annual international symposium on Computer architecture
Express virtual channels: towards the ideal interconnection fabric
Proceedings of the 34th annual international symposium on Computer architecture
Approaching Ideal NoC Latency with Pre-Configured Routes
NOCS '07 Proceedings of the First International Symposium on Networks-on-Chip
A Hybrid Ring/Mesh Interconnect for Network-on-Chip Using Hierarchical Rings for Global Routing
NOCS '07 Proceedings of the First International Symposium on Networks-on-Chip
Design of a Dynamic Priority-Based Fast Path Architecture for On-Chip Interconnects
HOTI '07 Proceedings of the 15th Annual IEEE Symposium on High-Performance Interconnects
A Scalable, Non-blocking Approach to Transactional Memory
HPCA '07 Proceedings of the 2007 IEEE 13th International Symposium on High Performance Computer Architecture
Hi-index | 0.00 |
With increasing number of cores, the communication latency of Network-on-Chip becomes a dominant problem due to complex operations per node. In this paper, we try to reduce communication latency by proposing single-cycle router architecture with wing channel, which forwards the incoming packets to free ports immediately with the inspection of switch allocation results. Also, the incoming packets granted with wing channel can fill in the time-slots of crossbar switch and reduce the contentions with subsequent ones, thereby pushing throughput effectively. We design the proposed router using 65nm CMOS process, and the results show that it supports different routing schemes and outperforms express virtual channel, prediction and Kumar's single-cycle ones in terms of latency and throughput. When compared to the speculative router, it provides 45.7% latency reduction and 14.0% throughput improvement. Moreover, we show that the proposed design incurs a modest area overhead of 8.1% but the power consumption is saved by 7.8% due to less arbitration activities.