Exploiting address compression and heterogeneous interconnects for efficient message management in tiled CMPs

Authors:
Antonio Flores;Manuel E. Acacio;Juan L. Aragón
Affiliations:
Departamento de Ingeniería y Tecnología de Computadores, University of Murcia, 30100 Murcia, Spain;Departamento de Ingeniería y Tecnología de Computadores, University of Murcia, 30100 Murcia, Spain;Departamento de Ingeniería y Tecnología de Computadores, University of Murcia, 30100 Murcia, Spain
Venue:
Journal of Systems Architecture: the EUROMICRO Journal
Year:
2010

Abstract

High-performance processor designs have evolved toward architectures that integrate multiple processing cores on the same chip. As the number of cores inside a chip multiprocessor (CMP) increases, the interconnection network will have a significant impact on both overall performance and energy consumption, as previous studies have shown. Moreover, the wires used in such an interconnect can be designed with varying latency, bandwidth, and power characteristics. In this work, we show how messages can be managed efficiently in tiled CMPs, in terms of both performance and energy, by combining address compression with a heterogeneous interconnect. In particular, our proposal applies an address compression scheme that dynamically compresses the addresses within coherence messages, which frees a significant amount of wire area. The freed area is exploited to improve wire latency by using a heterogeneous interconnection network comprising a small set of very-low-latency wires for critical short messages in addition to the baseline wires. Detailed simulations of a 16-core CMP show that our proposal obtains average improvements of 10% in execution time and 38% in the energy-delay^2 product of the interconnect. Additionally, the sensitivity analysis shows that our proposal performs well when either out-of-order (OoO) cores or caches with higher latencies are considered.
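
The idea of compressing addresses so that more messages fit on a narrow set of fast wires can be illustrated with a minimal sketch. The abstract does not specify the compression scheme, so the code below assumes a simple base-table approach: sender and receiver each keep a small table of high-order address bits, and an address whose high bits are already in the table is sent as a table index plus its low-order bits. The class names, table size, number of low-order bits, round-robin replacement, and the `pick_wires` helper are all hypothetical choices made for illustration, not the authors' design.

```python
class Compressor:
    """Sender side: hypothetical base-table address compressor."""

    def __init__(self, entries=4, low_bits=8):
        self.low_bits = low_bits          # assumed: low bits always sent as-is
        self.table = [None] * entries     # cached high-order address bits
        self.next_victim = 0              # assumed: round-robin replacement

    def compress(self, addr):
        high = addr >> self.low_bits
        low = addr & ((1 << self.low_bits) - 1)
        if high in self.table:
            # Hit: send only the table index and the low-order bits.
            return ("short", self.table.index(high), low)
        # Miss: install the high bits and send the full address,
        # together with the slot so the receiver can mirror the update.
        slot = self.next_victim
        self.table[slot] = high
        self.next_victim = (slot + 1) % len(self.table)
        return ("full", slot, addr)


class Decompressor:
    """Receiver side: mirrors the sender's table from the messages it sees."""

    def __init__(self, entries=4, low_bits=8):
        self.low_bits = low_bits
        self.table = [None] * entries

    def decompress(self, msg):
        kind, slot, payload = msg
        if kind == "short":
            return (self.table[slot] << self.low_bits) | payload
        self.table[slot] = payload >> self.low_bits
        return payload


def pick_wires(msg):
    """Assumed policy: compressed messages use the low-latency wires.

    Message criticality, which the abstract also mentions, is ignored here.
    """
    return "low_latency" if msg[0] == "short" else "baseline"


if __name__ == "__main__":
    sender, receiver = Compressor(), Decompressor()
    for addr in (0x12345678, 0x12345690, 0xABCD0000, 0x123456FF):
        msg = sender.compress(addr)
        print(hex(addr), msg[0], pick_wires(msg), hex(receiver.decompress(msg)))
```

In this sketch, consecutive addresses that share their high-order bits are sent in short form after the first one, which is the kind of locality a dynamic scheme would rely on. The receiver must see messages in the order the sender produced them for the two tables to stay consistent.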