Communication and memory architecture design of application-specific high-end multiprocessors

Authors:
Yahya Jan;Lech Jóźwiak
Affiliations:
Faculty of Electrical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands;Faculty of Electrical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
Venue:
VLSI Design
Year:
2012

Abstract

This paper addresses the design of communication and memory architectures for the massively parallel hardware multiprocessors needed to implement highly demanding applications. We demonstrate that, for such multiprocessors, the traditionally used flat communication architectures and multi-port memories do not scale well, and that the influence of the memory and communication network on both throughput and circuit area dominates that of the processors. To resolve these problems and ensure scalability, we propose designing highly optimized application-specific hierarchical and/or partitioned communication and memory architectures by exploring and exploiting the regularity and hierarchy of the actual data flows of a given application. Furthermore, we propose data distribution and related data mapping schemes for the shared (global) partitioned memories, with the aim of eliminating memory access conflicts and ensuring that our communication design strategies are applicable. We incorporated these architecture synthesis strategies into our quality-driven, model-based multiprocessor design method and the related automated architecture exploration framework. Using this framework, we performed a large series of experiments that demonstrate many important features of the synthesized memory and communication architectures. The experiments also demonstrate that our method and framework can efficiently synthesize well-scalable memory and communication architectures, even for high-end multiprocessors. Gains as high as 12 times in performance and 25 times in area can be obtained by using hierarchical communication networks instead of flat networks. However, at high parallelism levels, only the partitioned approach ensures scalability in performance.
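
To make the idea of conflict-free data mapping in partitioned memories concrete, the following is a minimal sketch and not the mapping scheme of the paper. It assumes a hypothetical memory split into four single-port banks and four processors that each issue one address per cycle. The function names, the bank count, and the skewing rule are all illustrative assumptions.

```python
from collections import Counter

NUM_BANKS = 4  # assumed number of single-port memory partitions


def interleaved_bank(address: int, num_banks: int = NUM_BANKS) -> int:
    """Plain low-order interleaving: consecutive addresses hit consecutive banks."""
    return address % num_banks


def skewed_bank(address: int, num_banks: int = NUM_BANKS) -> int:
    """Skewed interleaving: each row of num_banks words is rotated by its row index."""
    row = address // num_banks
    return (address + row) % num_banks


def has_conflict(addresses, bank_fn) -> bool:
    """True if two addresses issued in the same cycle map to the same bank."""
    counts = Counter(bank_fn(a) for a in addresses)
    return any(n > 1 for n in counts.values())


# Hypothetical access patterns for four processors in one cycle.
unit_stride = [0, 1, 2, 3]    # neighbouring words
stride_four = [0, 4, 8, 12]   # one word from each row (a "column" access)

print(has_conflict(unit_stride, interleaved_bank))
print(has_conflict(stride_four, interleaved_bank))
print(has_conflict(stride_four, skewed_bank))
```

Under these assumptions, plain interleaving serves the unit-stride pattern without conflicts but sends every stride-four access to the same bank, whereas the skewed mapping spreads both patterns across all four banks. A real design would derive the mapping from the actual data flows of the application, as the abstract describes.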