Data-intensive on-chip functions such as codecs, 3D graphics, and pixel processing need to make the best use of the increased bandwidth that 3D die stacking provides by accessing multiple memories in parallel. When the original accesses carry in-order requirements, issuing them in parallel necessitates reorder buffers to avoid deadlock. Reorder buffers are expensive in terms of area and power consumption, and conventional reorder buffers also suffer from low resource utilization. In this work, we present a novel idea, called the in-network reorder buffer, to increase the utilization of reorder buffer resources. In our method, we move the reorder buffer resources and their related functions from the network entry/exit points into the network routers. The in-network reorder buffers can then be better utilized in two ways. First, packets without in-order requirements can use them whenever no in-order packets are present. Second, in-order packets also benefit, because each can obtain a larger share of reorder buffer resources than before. This increase in reorder buffer utilization improves NoC performance while still supporting the original in-order requirements. Experimental results with an industrial-strength DTV SoC example show that the presented idea reduces total execution cycles by 16.9%.
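As an illustration of the basic function a reorder buffer performs, the following is a minimal Python sketch, not the design presented in this work. It models only the reordering itself: packets that arrive out of order are held until every earlier sequence number has been seen. The names `Packet`, `ReorderBuffer`, `accept`, and the `capacity` parameter are hypothetical, and the sketch does not model router placement, slot sharing with packets that have no in-order requirement, or deadlock avoidance.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Packet:
    seq: int       # position in the original in-order stream (assumed field)
    payload: str


@dataclass
class ReorderBuffer:
    """Hypothetical reorder buffer with a fixed number of holding slots."""
    capacity: int
    next_seq: int = 0
    held: Dict[int, Packet] = field(default_factory=dict)

    def accept(self, pkt: Packet) -> List[Packet]:
        """Return the packets that can be forwarded in order after pkt arrives."""
        if pkt.seq != self.next_seq:
            # Out-of-order arrival: it must occupy a slot until its turn comes.
            if len(self.held) >= self.capacity:
                raise BufferError("no free slot for out-of-order packet")
            self.held[pkt.seq] = pkt
            return []
        # In-order arrival: forward it, then drain any held successors.
        released = [pkt]
        self.next_seq += 1
        while self.next_seq in self.held:
            released.append(self.held.pop(self.next_seq))
            self.next_seq += 1
        return released
```

For example, with `capacity=2`, accepting packets with sequence numbers 2 and then 1 returns an empty list each time, and accepting sequence number 0 then releases all three packets in order. A slot is occupied only while an out-of-order packet waits, so in this model the slots sit idle whenever no in-order traffic is present, which is the kind of underutilization the abstract describes.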