Operation and data mapping for CGRAs with multi-bank memory

Authors:
Yongjoo Kim;Jongeun Lee;Aviral Shrivastava;Yunheung Paek
Affiliations:
Seoul National University, Seoul, South Korea;Ulsan National Institute of Science and Technology, Ulsan, South Korea;Arizona State University, Tempe, AZ, USA;Seoul National University, Seoul, South Korea
Venue:
Proceedings of the ACM SIGPLAN/SIGBED 2010 conference on Languages, compilers, and tools for embedded systems
Year:
2010


Abstract

Coarse Grain Reconfigurable Architectures (CGRAs) promise high performance at high power efficiency. They fulfil this promise by keeping the hardware extremely simple and moving the complexity to application mapping. One major challenge is data mapping. For reasons of power efficiency and complexity, CGRAs use multi-bank local memory, and each row of processing elements (PEs) shares memory access. So that each row of PEs can access any memory bank, a hardware arbiter sits between the memory requests generated by the PEs and the banks of the local memory. However, a fundamental restriction remains: a bank cannot be accessed by two different PEs at the same time. We propose to meet this challenge by mapping application operations onto PEs and data into memory banks in a way that avoids such conflicts. Our experimental results on kernels from multimedia benchmarks demonstrate that our local memory-aware compilation approach can generate mappings that perform up to 40% better (17.3% on average) than those of a memory-unaware scheduler.
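The bank-access restriction described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm. It only checks a given schedule and data placement for cycles in which two different PEs touch the same bank. The function name find_bank_conflicts, the tuple layout (cycle, pe, array), the bank_of dictionary, and the sample schedule are all assumptions made for illustration.

```python
from collections import defaultdict


def find_bank_conflicts(accesses, bank_of):
    """Return (cycle, bank, pes) for every cycle in which one bank is hit by 2+ PEs.

    accesses: iterable of (cycle, pe, array) memory operations (hypothetical layout).
    bank_of:  dict mapping each array name to the bank index it is placed in.
    """
    users = defaultdict(set)  # (cycle, bank) -> set of PEs accessing it
    for cycle, pe, array in accesses:
        users[(cycle, bank_of[array])].add(pe)
    return sorted(
        (cycle, bank, sorted(pes))
        for (cycle, bank), pes in users.items()
        if len(pes) > 1
    )


# Hypothetical schedule: which PE loads from which array in which cycle.
accesses = [
    (0, "PE0", "A"),
    (0, "PE1", "B"),
    (1, "PE0", "C"),
    (1, "PE2", "A"),
]

# Placement chosen without regard to the schedule: A and B share bank 0.
unaware_placement = {"A": 0, "B": 0, "C": 1}
# Placement that separates arrays accessed in the same cycle.
aware_placement = {"A": 0, "B": 1, "C": 1}

conflicts_unaware = find_bank_conflicts(accesses, unaware_placement)
conflicts_aware = find_bank_conflicts(accesses, aware_placement)
print(conflicts_unaware)
print(conflicts_aware)
```

In this toy case, moving array B to the other bank removes the only conflict without changing the operation schedule. A memory-aware mapper in the sense of the abstract would choose the operation placement and the data placement together, which this check alone does not do.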