A study of 3D Network-on-Chip design for data parallel H.264 coding

Authors:
Thomas Canhao Xu;Alexander Wei Yin;Pasi Liljeberg;Hannu Tenhunen
Affiliations:
All authors: Department of Information Technology, University of Turku, Joukahaisenkatu 3-5B, Turku 20520, Finland and Turku Center for Computer Science, Joukahaisenkatu 3-5B, 6th Floor, Turku 20520, Finland
Venue:
Microprocessors & Microsystems
Year:
2011

Abstract

In this paper, we implement, analyze and compare different Network-on-Chip (NoC) architectures aimed at higher efficiency for MPEG-4/H.264 coding. Two-dimensional (2D) and three-dimensional (3D) NoCs based on Non-Uniform Cache Access (NUCA) are analyzed. We present results obtained with a full-system simulator running realistic workloads. Experiments show that the average network latencies of the two 3D NoCs are 28% and 34% lower, respectively, than that of the 2D design. We also show that heat dissipation is a trade-off when improving the performance of 3D chips. Our analysis and experimental results provide guidelines for designing efficient 3D NoCs for data-parallel H.264 coding applications.