The efficiency of a coarse-grained reconfigurable array architecture, in terms of performance and hardware cost, is hard to determine. The large number of parameters that define an architecture instance, together with the complexity of mapping, makes evaluation extremely difficult without tool assistance. This paper investigates the four factors directly related to the efficiency of these architectures: area, clock frequency, scheduling efficiency, and performance. A unified exploration framework has been built to estimate these four factors for different architecture alternatives. The framework consists of two parts: (a) an existing retargetable compiler, from which the mapping efficiency is estimated, and (b) a parametric realization of the coarse-grained reconfigurable array in a hardware description language (VHDL). The latter is used to estimate the area and clock frequency of each architecture instance by realizing the system in a 0.13 µm ASIC process technology. The experiments cover architecture instances that differ in the interconnection network of the processing elements, the size of the register files, their number of input/output ports, and the available bandwidth. In total, 72 architecture scenarios have been studied, revealing how each characteristic influences performance and area, so that design decisions can be made efficiently.
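The exploration loop described above can be sketched roughly as follows. This is a minimal illustration, not the paper's tooling: the class names, the parameter names, and the choice of execution time (cycles divided by clock frequency) as the performance measure are all assumptions, and the per-instance estimates are expected to come from an external compiler run and synthesis run that the sketch does not model.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ArchInstance:
    """One point in the design space (all field names are hypothetical)."""
    interconnect: str   # label for the PE interconnection network
    rf_size: int        # registers per register file
    rf_ports: int       # input/output ports per register file
    bandwidth: int      # available bandwidth, in assumed units


@dataclass(frozen=True)
class Estimate:
    """Figures assumed to be supplied by synthesis and by the compiler."""
    area: float         # from synthesis, arbitrary units
    clock_mhz: float    # from synthesis
    cycles: int         # schedule length reported by the compiler

    @property
    def exec_time_us(self) -> float:
        # cycles / (cycles per microsecond); an assumed performance metric
        return self.cycles / self.clock_mhz


def enumerate_instances(space):
    """Yield every combination of the parameter values in `space`."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield ArchInstance(**dict(zip(keys, values)))


def pareto_front(results):
    """Keep (instance, estimate) pairs not dominated in area and time."""
    front = []
    for arch, est in results:
        dominated = any(
            o.area <= est.area
            and o.exec_time_us <= est.exec_time_us
            and (o.area < est.area or o.exec_time_us < est.exec_time_us)
            for _, o in results
        )
        if not dominated:
            front.append((arch, est))
    return front
```

In use, each instance produced by `enumerate_instances` would be paired with an `Estimate` filled in from the compiler and synthesis reports, and `pareto_front` would then list the instances for which no alternative is both smaller and faster.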