Hierarchical Partitioning for Piecewise Linear Algorithms

Authors:
Hritam Dutta;Frank Hannig;Jurgen Teich
Affiliations:
University of Erlangen-Nuremberg, Germany;University of Erlangen-Nuremberg, Germany;University of Erlangen-Nuremberg, Germany
Venue:
PARELEC '06 Proceedings of the international symposium on Parallel Computing in Electrical Engineering
Year:
2006

Citing 0
Cited 7

Efficient control generation for mapping nested loop programs onto processor arrays
Journal of Systems Architecture: the EUROMICRO Journal

PARO: Synthesis of Hardware Accelerators for Multi-dimensional Dataflow-Intensive Applications
ARC '08 Proceedings of the 4th international workshop on Reconfigurable Computing: Architectures, Tools and Applications

Automatic generation of a parallel tile processing unit for algorithms with non-affine array references
IFMT '08 Proceedings of the 1st international forum on Next-generation multicore/manycore technologies

A holistic approach for tightly coupled reconfigurable parallel processors
Microprocessors & Microsystems

Parallelization Approaches for Hardware Accelerators --- Loop Unrolling Versus Loop Partitioning
ARCS '09 Proceedings of the 22nd International Conference on Architecture of Computing Systems

Efficient Data Access Management for FPGA-Based Image Processing SoCs
RSP '09 Proceedings of the 2009 IEEE/IFIP International Symposium on Rapid System Prototyping

Exploration of 3D grid caching strategies for ray-shooting
Journal of Real-Time Image Processing

Abstract

Processor arrays are used as accelerators for many dataflow-dominant applications. The explosive growth in research and development of massively parallel processor array architectures has led to a demand for mapping tools that realize the full potential of these architectures. Such architectures are characterized by hierarchies of parallelism and memory structures: apart from different levels of cache, a processor array has a number of processing elements (PEs), and each PE can in turn offer sub-word parallelism. To handle large-scale problems, balance local memory requirements against I/O bandwidth, and exploit the different hierarchies of parallelism and memory, one needs a sophisticated transformation called hierarchical partitioning. In this paper, we introduce for the first time a detailed methodology encompassing hierarchical partitioning.
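To give a rough idea of what partitioning an iteration space on more than one level means, the following is a minimal sketch in Python. It is not the methodology of the paper: it only shows a generic two-level tiling of a one-dimensional loop. The names `decompose`, `recompose`, `hierarchical_order`, `p1`, and `p2` are hypothetical, and the assignment of loop levels to hardware given in the comments is an assumption made for illustration.

```python
def decompose(i, p1, p2):
    """Split index i into (outer tile, inner tile, offset in inner tile).

    p1: size of an inner tile (hypothetical parameter).
    p2: number of inner tiles per outer tile (hypothetical parameter).
    """
    offset = i % p1
    inner = (i // p1) % p2
    outer = i // (p1 * p2)
    return outer, inner, offset


def recompose(outer, inner, offset, p1, p2):
    """Inverse of decompose: rebuild the original index."""
    return (outer * p2 + inner) * p1 + offset


def hierarchical_order(n, p1, p2):
    """Enumerate the iterations 0..n-1 as a three-deep loop nest."""
    tile = p1 * p2
    n_outer = -(-n // tile)  # ceiling division
    order = []
    for outer in range(n_outer):        # assumed: outer tiles run one after another
        for inner in range(p2):         # assumed: inner tiles map to different PEs
            for offset in range(p1):    # assumed: iterations local to one PE
                i = recompose(outer, inner, offset, p1, p2)
                if i < n:               # skip padding in the last, partial tile
                    order.append(i)
    return order


if __name__ == "__main__":
    print(decompose(7, 2, 3))
    print(hierarchical_order(10, 2, 3))
```

In this sketch the tile sizes are the knobs: a larger `p1` means more data held locally per inner tile, while a larger `p2` means more inner tiles that could run side by side. That is one simple way to picture the trade-off between local memory and I/O bandwidth mentioned in the abstract.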