Load balancing packets on a tile-based massive multi-core processor with S-NUCA

Authors: Enric Musoll
Affiliations: ConSentry Networks, Inc.
Venue: Proceedings of the 6th ACM/IEEE Symposium on Architectures for Networking and Communications Systems
Year: 2010

Abstract

In massive tile-based multi-core architectures, the packets of a given flow should be processed on a set of cores that are physically close to each other, so as to minimize the average latency of accesses to the shared data structures held across those cores' local caches. A static NUCA (S-NUCA) implementation provides a substrate for a cost-effective cache-sharing mechanism. However, careful mapping of the data structures in the system's memory, together with a smart mechanism for load balancing packets across cores, is essential to avoid long-latency accesses to remote data. This work proposes a methodology for load balancing packets to cores in an S-NUCA tile-based architecture with a large number of cores.
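
The abstract does not describe the methodology itself, so the following is only an illustrative sketch of the general idea, not the paper's algorithm. It assumes a hypothetical 8x8 mesh split into 2x2 clusters of adjacent tiles, hashes a flow key to one cluster so that all packets of the flow stay physically close, and then picks the least-loaded core inside that cluster. The mesh size, the cluster shape, the CRC32 hash, the queue-depth load metric, and every function name are assumptions made for illustration.

```python
import zlib

# Hypothetical parameters (not taken from the paper).
MESH_W, MESH_H = 8, 8          # 64 tiles arranged as a 2D mesh
CLUSTER_W, CLUSTER_H = 2, 2    # each cluster is a 2x2 block of adjacent tiles


def build_clusters():
    """Return a list of clusters, each a list of (x, y) tile coordinates."""
    clusters = []
    for cy in range(0, MESH_H, CLUSTER_H):
        for cx in range(0, MESH_W, CLUSTER_W):
            clusters.append([(cx + dx, cy + dy)
                             for dy in range(CLUSTER_H)
                             for dx in range(CLUSTER_W)])
    return clusters


CLUSTERS = build_clusters()


def cluster_for_flow(flow_key: bytes) -> int:
    """Map a flow to a cluster index with a deterministic hash (CRC32)."""
    return zlib.crc32(flow_key) % len(CLUSTERS)


def pick_core(flow_key: bytes, queue_depth: dict) -> tuple:
    """Pick the least-loaded tile within the flow's cluster.

    queue_depth maps (x, y) -> pending packets; missing tiles count as 0.
    """
    tiles = CLUSTERS[cluster_for_flow(flow_key)]
    return min(tiles, key=lambda t: queue_depth.get(t, 0))


def hops(a: tuple, b: tuple) -> int:
    """Manhattan distance between two tiles, a rough proxy for mesh latency."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])
```

In this sketch, any two cores serving the same flow are at most two mesh hops apart, which bounds the distance to flow state cached on a neighbouring tile of the cluster. A real design would also have to place the shared data structures so that their S-NUCA home banks fall near the chosen cluster, which this sketch does not model.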