Temperature-aware microarchitecture
Proceedings of the 30th annual international symposium on Computer architecture
Reducing power density through activity migration
Proceedings of the 2003 international symposium on Low power electronics and design
A Delay Model and Speculative Architecture for Pipelined Routers
HPCA '01 Proceedings of the 7th International Symposium on High-Performance Computer Architecture
Source-level IP packet bursts: causes and effects
Proceedings of the 3rd ACM SIGCOMM conference on Internet measurement
Principles and Practices of Interconnection Networks
Principles and Practices of Interconnection Networks
Low-Latency Virtual-Channel Routers for On-Chip Networks
Proceedings of the 31st annual international symposium on Computer architecture
Microarchitectural techniques for power gating of execution units
Proceedings of the 2004 international symposium on Low power electronics and design
Heat-and-run: leveraging SMT and CMP to manage power density through the operating system
ASPLOS XI Proceedings of the 11th international conference on Architectural support for programming languages and operating systems
Thermal-Aware Clustered Microarchitectures
ICCD '04 Proceedings of the IEEE International Conference on Computer Design
Performance Evaluation and Design Trade-Offs for Network-on-Chip Interconnect Architectures
IEEE Transactions on Computers
A scalable load balancer for forwarding internet traffic: exploiting flow-level burstiness
Proceedings of the 2005 ACM symposium on Architecture for networking and communications systems
Analytical Model for Sensor Placement on Microprocessors
ICCD '05 Proceedings of the 2005 International Conference on Computer Design
Tile size selection for low-power tile-based architectures
Proceedings of the 3rd conference on Computing frontiers
Techniques for Multicore Thermal Management: Classification and New Exploration
Proceedings of the 33rd annual international symposium on Computer Architecture
High-level power analysis for multi-core chips
CASES '06 Proceedings of the 2006 international conference on Compilers, architecture and synthesis for embedded systems
Design space exploration for multicore architectures: a power/performance/thermal view
Proceedings of the 20th annual international conference on Supercomputing
Design tradeoffs for tiled CMP on-chip networks
Proceedings of the 20th annual international conference on Supercomputing
Interconnect design considerations for large NUCA caches
Proceedings of the 34th annual international symposium on Computer architecture
Microprocessors in the era of terascale integration
Proceedings of the conference on Design, automation and test in Europe
Temperature aware task scheduling in MPSoCs
Proceedings of the conference on Design, automation and test in Europe
Thermal-aware task scheduling at the system software level
ISLPED '07 Proceedings of the 2007 international symposium on Low power electronics and design
Thermal-aware scheduling for future chip multiprocessors
EURASIP Journal on Embedded Systems
Flattened Butterfly Topology for On-Chip Networks
Proceedings of the 40th Annual IEEE/ACM International Symposium on Microarchitecture
Energy scalability of on-chip interconnection networks
Energy scalability of on-chip interconnection networks
Many-core design from a thermal perspective
Proceedings of the 45th annual Design Automation Conference
A Thermal-Friendly Load-Balancing Technique for Multi-Core Processors
ISQED '08 Proceedings of the 9th international symposium on Quality Electronic Design
Low Power Methodology Manual: For System-on-Chip Design
Low Power Methodology Manual: For System-on-Chip Design
Computers and Electrical Engineering
The Journal of Supercomputing
Hi-index | 0.00 |
Massive multi-core architectures provide a computation platform with high processing throughput, enabling the efficient processing of workloads with a significant degree of thread-level parallelism found in networking environments. Communication-centric workloads, like those in LAN and WAN environments, are fundamentally composed of sets of packets, named flows. The packets within a flow usually have dependencies among them, which reduce the amount of parallelism. However, packets of different flows tend to have very few or no dependencies among them, and thus can exploit thread-level parallelism to its fullest extent. Therefore, in massive tile-based multi-core architectures, it is important that the processing of the packets of a particular flow takes place in a set of cores physically close to each other to minimize the communication latency among those cores. Moreover, it is also desirable to spread out the processing of the different flows across all the cores of the processor in order to minimize the stress on a reduced number of cores, thus minimizing the potential for thermal hotspots and increasing the reliability of the processor. In addition, the burst-like nature of packet-based workloads render most of the cores idle most of the time, enabling large power savings by power gating these idle cores. This work presents a high-level study of the performance, power, and thermal behavior of tile-based architectures with a large number of cores executing flow-based packet workloads, and proposes a load-balancing policy of assigning packets to cores that minimizes the communication latency while featuring a hotspot-free thermal profile.