This paper presents a novel method for improving the operation autonomy of the processing elements (PEs) of SIMD-like machines. Combining guarded instructions with pseudo branches achieves higher operation autonomy and higher instruction-level parallelism than previous SIMD/ASIMD architectures. The paper shows that most branches can be avoided and that conditional execution can be emulated on the processing elements, using either guarded instructions or pseudo branches, thus avoiding unnecessary intervention by the array control unit in data-dependent computations. Pseudo branches are used where guarded instructions cannot be. They also support the implementation of complex nested if-then-else constructs, improving the execution of irregular data-parallel applications. Because the method is simple and requires no significant additional silicon area, it can be implemented in both fine-grain and coarse-grain SIMD/ASIMD architectures. Finally, the paper shows that pseudo branches can be used to control power saving in those processing elements whose instructions are nullified.
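The idea of per-PE conditional execution can be illustrated with a small software model. The sketch below is not the architecture described in the paper: the abstract does not specify how guards or pseudo branches are encoded, so the opcode names (`cmp_gt`, `gadd`, `gadd_not`, `pbr_if_false`, `pbr`), the per-PE skip counter, and the `PE` and `run` helpers are all assumptions made for illustration. The model assumes a control unit that broadcasts one linear instruction stream, with each PE deciding locally whether to execute or nullify each instruction.

```python
from dataclasses import dataclass


@dataclass
class PE:
    """Hypothetical processing element; all fields are illustrative assumptions."""
    x: int               # local data item held by this PE
    acc: int = 0         # local accumulator
    guard: bool = True   # local guard bit set by a compare
    skip: int = 0        # instructions still to nullify after a pseudo branch
    nullified: int = 0   # count of nullified instruction slots


def step(pe, op, arg):
    """Apply one broadcast instruction to a single PE."""
    if pe.skip > 0:
        # Assumed pseudo-branch behaviour: the PE ignores the instruction
        # locally while the control unit keeps issuing the same stream.
        pe.skip -= 1
        pe.nullified += 1
        return
    if op == "cmp_gt":
        pe.guard = pe.x > arg
    elif op == "add":
        pe.acc += arg
    elif op in ("gadd", "gadd_not"):
        # Guarded add: executes only when the guard matches the wanted value.
        wanted = (op == "gadd")
        if pe.guard == wanted:
            pe.acc += arg
        else:
            pe.nullified += 1
    elif op == "pbr_if_false":
        # Conditional pseudo branch: skip the next `arg` instructions.
        if not pe.guard:
            pe.skip = arg
    elif op == "pbr":
        # Unconditional pseudo branch: skip the next `arg` instructions.
        pe.skip = arg
    else:
        raise ValueError(f"unknown opcode: {op}")


def run(xs, program):
    """Broadcast `program` to one PE per element of `xs`."""
    pes = [PE(x) for x in xs]
    for op, arg in program:      # control unit never branches
        for pe in pes:           # conceptually parallel across PEs
            step(pe, op, arg)
    return [pe.acc for pe in pes], [pe.nullified for pe in pes]


# if x > 5: acc += 10  else: acc += 20, using guarded instructions only
GUARDED = [("cmp_gt", 5), ("gadd", 10), ("gadd_not", 20)]

# if x > 5: acc += 10; acc += 1  else: acc += 20, then acc += 100 for all PEs
BRANCHED = [
    ("cmp_gt", 5),
    ("pbr_if_false", 3),  # guard false: skip the then-block and its closing pbr
    ("add", 10),
    ("add", 1),
    ("pbr", 1),           # then-path PEs skip the else-block
    ("add", 20),
    ("add", 100),         # join point, executed by every PE
]

if __name__ == "__main__":
    print(run([2, 7, 9, 4], GUARDED))   # ([20, 10, 10, 20], [1, 1, 1, 1])
    print(run([2, 7, 9, 4], BRANCHED))  # ([120, 111, 111, 120], [3, 1, 1, 3])
```

In this model the control unit issues the same instruction sequence regardless of the data, which is the sense in which its intervention is avoided. The `nullified` counter marks the slots in which a PE does no useful work; a design along these lines could, under the same assumptions, use that information to gate idle PEs for power saving.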