An FPGA-based VLIW processor with custom hardware execution
Proceedings of the 2005 ACM/SIGDA 13th international symposium on Field-programmable gate arrays
ICC'08 Proceedings of the 12th WSEAS international conference on Circuits
In this paper, we investigate the benefits of a flexible, application-specific instruction set by adding a run-time Reconfigurable Functional Unit (RFU) to a VLIW processor. Preliminary results on the motion estimation stage in an MPEG4 video encoder are presented. With the RFU modeled at functional level and under realistic assumptions on execution latency, technology scaling and reconfiguration penalty, we explore different RFU instructions at fine-grain (instruction-level) and coarse-grain (loop-level) granularity to speed up the application execution. The memory bandwidth bottleneck, typical for streaming applications, is alleviated through the combined adoption of custom prefetch pattern instructions and an extent of local memory. Performance evaluations indicate that up to 8x improvement is achieved with loop-level optimizations under various architectural assumptions.
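To make the two granularities concrete, the following is a minimal sketch of a block-matching motion estimation kernel. The abstract does not name the matching metric or the search strategy, so the sum of absolute differences (SAD), the exhaustive search, the 8x8 block size, and the function names `sad_block` and `full_search` are all assumptions chosen for illustration, not the authors' implementation. The comments mark where a fine-grain (instruction-level) and a coarse-grain (loop-level) RFU instruction could plausibly apply.

```python
def sad_block(cur, ref, bx, by, dx, dy, n=8):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy).

    Assumed metric: the abstract does not specify SAD.
    """
    total = 0
    # Coarse-grain (loop-level) candidate: the whole n x n loop nest
    # could be mapped onto a single RFU instruction.
    for y in range(n):
        for x in range(n):
            # Fine-grain (instruction-level) candidate: a fused
            # "absolute difference and accumulate" operation.
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, search=2, n=8):
    """Exhaustive search over displacements in [-search, search].

    Returns (best_sad, (dx, dy)), or None if no candidate fits inside
    `ref`. Candidates that fall outside the reference frame are skipped.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside_x = 0 <= bx + dx and bx + dx + n <= width
            inside_y = 0 <= by + dy and by + dy + n <= height
            if not (inside_x and inside_y):
                continue
            cost = sad_block(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best
```

In this sketch every candidate displacement re-reads an overlapping window of the reference frame, which is the kind of regular, streaming access pattern that custom prefetch instructions and local memory are meant to serve.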