In this paper, we present the PARO design tool for the automated hardware synthesis of massively parallel embedded architectures for dataflow-dominant applications. Key features of PARO are: (1) a design entry in the form of a compact and intuitive functional programming language that allows highly parallel implementations; and (2) advanced partitioning techniques that balance the trade-off between cost and performance while meeting the required throughput, achieved by distributing computations onto an array of tightly coupled processor elements. We demonstrate the performance of the FPGA-synthesized hardware with several selected algorithms from different benchmarks.
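To illustrate the general idea of distributing a computation onto an array of processor elements, the following is a minimal sketch in Python. It is not PARO's input language or its partitioning algorithm: the FIR filter example, the contiguous block partitioning of the tap index, and the names `fir_reference`, `partition_taps`, and `fir_partitioned` are all assumptions made for illustration. Each hypothetical processor element accumulates the products for its own block of taps, and the partial sums are then combined.

```python
def fir_reference(x, taps):
    """Direct FIR filter: y[i] = sum_j taps[j] * x[i - j], with x[k] = 0 for k < 0."""
    return [
        sum(taps[j] * x[i - j] for j in range(len(taps)) if i - j >= 0)
        for i in range(len(x))
    ]


def partition_taps(num_taps, num_pes):
    """Split tap indices 0..num_taps-1 into contiguous blocks, one per processor element.

    Illustrative block partitioning only; trailing blocks may be shorter or empty.
    """
    block = -(-num_taps // num_pes)  # ceiling division
    return [range(p * block, min((p + 1) * block, num_taps)) for p in range(num_pes)]


def fir_partitioned(x, taps, num_pes):
    """Same filter, with the tap loop distributed over num_pes hypothetical PEs."""
    blocks = partition_taps(len(taps), num_pes)
    y = []
    for i in range(len(x)):
        # Each PE computes the partial sum for its own block of taps.
        partial = [
            sum(taps[j] * x[i - j] for j in blk if i - j >= 0)
            for blk in blocks
        ]
        # The partial sums are combined into the output sample.
        y.append(sum(partial))
    return y
```

In this sketch, using more processor elements shortens the per-element tap loop at the price of more hardware, which is the kind of cost and performance trade-off a partitioning step has to balance.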