Theory of linear and integer programming
Theory of linear and integer programming
Control generation in the design of processor arrays
Journal of VLSI Signal Processing Systems - Parallel processing on VLSI arrays
Field-programmable gate arrays: reconfigurable logic for rapid prototyping and implementation of digital systems
Partitioning Processor Arrays under Resource Constraints
Journal of VLSI Signal Processing Systems
Generation of Efficient Nested Loops from Polyhedra
International Journal of Parallel Programming - Special issue on instruction-level parallelism and parallelizing compilation, part 2
Constructing and exploiting linear schedules with prescribed parallelism
ACM Transactions on Design Automation of Electronic Systems (TODAES)
High Performance Compilers for Parallel Computing
High Performance Compilers for Parallel Computing
Design Space Exploration for Massively Parallel Processor Arrays
PaCT '01 Proceedings of the 6th International Conference on Parallel Computing Technologies
Exact Partitioning of Affine Dependence Algorithms
Embedded Processor Design Challenges: Systems, Architectures, Modeling, and Simulation - SAMOS
Loop Parallelization in the Polytope Model
CONCUR '93 Proceedings of the 4th International Conference on Concurrency Theory
Efficient code generation for automatic parallelization and optimization
ISPDC'03 Proceedings of the Second international conference on Parallel and distributed computing
Hierarchical algorithm partitioning at system level for an improved utilization of memory structures
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Efficient control generation for mapping nested loop programs onto processor arrays
Journal of Systems Architecture: the EUROMICRO Journal
A holistic approach for tightly coupled reconfigurable parallel processors
Microprocessors & Microsystems
Hi-index | 0.00 |
Processor arrays can be used as accelerators for a plenty of dataflow-dominant applications. Innately these applications have almost no control flow, but the application of sophisticated partitioning and scheduling techniques in order to handle large scale problems and to balance local memory requirements with I/O-bandwidth has the disadvantage of a more complex control flow. Thus, efficient control path synthesis is one of the greatest challenges when compiling algorithms onto processor arrays. This paper presents an efficient methodology for the automated control path synthesis for the mapping of partitioned algorithms onto processor arrays. The major advantages observed in the presented methodology are seen in, (a) control generation for different partitioning techniques and arbitrary parallelepiped tiles, (b) combined use of a global and a local control strategy in order to reduce the control overhead, (c) up to 90 percent reduction in control path area and resources compared to existing approaches.