A cost-effective load-balancing policy for tile-based, massive multi-core packet processors

Authors:
Enric Musoll
Affiliations:
ConSentry Networks, Milpitas, CA
Venue:
ACM Transactions on Embedded Computing Systems (TECS)
Year:
2010

Abstract

Massive multi-core architectures provide a computation platform with high processing throughput, enabling the efficient processing of workloads with a significant degree of thread-level parallelism, such as those found in networking environments. Communication-centric workloads, like those in LAN and WAN environments, are fundamentally composed of sets of packets, called flows. The packets within a flow usually have dependencies among them, which reduces the amount of parallelism. However, packets of different flows tend to have very few or no dependencies among them, and thus can exploit thread-level parallelism to its fullest extent. Therefore, in massive tile-based multi-core architectures, it is important that the packets of a particular flow are processed in a set of cores physically close to each other, to minimize the communication latency among those cores. Moreover, it is also desirable to spread the processing of different flows across all the cores of the processor, to avoid concentrating stress on a small number of cores, thus minimizing the potential for thermal hotspots and increasing the reliability of the processor. In addition, the burst-like nature of packet-based workloads renders most of the cores idle most of the time, enabling large power savings by power gating these idle cores. This work presents a high-level study of the performance, power, and thermal behavior of tile-based architectures with a large number of cores executing flow-based packet workloads, and proposes a load-balancing policy for assigning packets to cores that minimizes the communication latency while featuring a hotspot-free thermal profile.
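The two goals stated in the abstract (keep the packets of one flow on physically close cores, and spread different flows across the whole chip) can be illustrated with a small sketch. This is not the policy proposed in the paper, whose details the abstract does not give. It is a toy stand-in that assumes a square mesh of tiles split into fixed square blocks, pins each flow to one block, and gives every new flow the block with the least accumulated work. The class name `FlowLocalBalancer`, its parameters `mesh_dim` and `block_dim`, and the packet-count load metric are all hypothetical, and power gating is not modeled.

```python
from collections import defaultdict
from itertools import product


class FlowLocalBalancer:
    """Toy policy (assumption, not the paper's): pin each flow to one block of tiles."""

    def __init__(self, mesh_dim=8, block_dim=2):
        # Partition a mesh_dim x mesh_dim tile grid into block_dim x block_dim blocks.
        assert mesh_dim % block_dim == 0
        self.block_dim = block_dim
        n = mesh_dim // block_dim
        self.blocks = list(product(range(n), range(n)))
        self.block_load = {b: 0 for b in self.blocks}  # packets assigned to each block so far
        self.flow_block = {}                           # flow id -> block coordinates
        self.flow_next = defaultdict(int)              # flow id -> next tile index in its block

    def assign(self, flow_id):
        """Return the (row, col) tile that processes the next packet of flow_id."""
        if flow_id not in self.flow_block:
            # New flow: choose the block with the least accumulated work
            # (ties go to the first block in list order). This spreads flows out.
            self.flow_block[flow_id] = min(self.blocks, key=self.block_load.__getitem__)
        block = self.flow_block[flow_id]
        self.block_load[block] += 1
        # Round-robin over the tiles inside the block, so packets of the same
        # flow stay within a few mesh hops of each other.
        i = self.flow_next[flow_id]
        self.flow_next[flow_id] = (i + 1) % (self.block_dim ** 2)
        block_row, block_col = block
        return (block_row * self.block_dim + i // self.block_dim,
                block_col * self.block_dim + i % self.block_dim)
```

With `FlowLocalBalancer(mesh_dim=4, block_dim=2)`, successive packets of one flow land on the four tiles of a single 2x2 block, while a second flow arriving later is sent to a different, less loaded block. A more realistic policy would likely use measured core activity or temperature rather than packet counts, and would let flows migrate over time.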