Near-Optimal Microprocessor and Accelerators Codesign with Latency and Throughput Constraints
ACM Transactions on Architecture and Code Optimization (TACO)
This paper presents the mapping of an object detection application, aerial-image-based vehicle detection on highways, onto a configurable heterogeneous RISC/coprocessor architecture. An extended pipelined processing scheme exploits the coprocessor features for parallel task processing. An alternative latency-minimized processing scheme removes data dependencies in the application and improves its parallel task processing capabilities. The coprocessor is mapped onto a Xilinx Virtex-5 FPGA on a RISC/FPGA-based embedded system board. The RISC in combination with the configurable coprocessor, running at 100 MHz, can either process up to 28.7 Full HD frames per second or reduce system latency to less than 55 ms. This approach can therefore be used to map complex object detection applications with high demands on throughput and latency onto the architecture.
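To put the reported figures in perspective, the following minimal sketch converts the throughput number into a pixel rate and a per-pixel clock budget. It is not taken from the paper. It assumes that "Full HD" means 1920x1080 pixels and that the 100 MHz clock is the one that bounds pixel processing; the function names are hypothetical and chosen only for illustration.

```python
# Back-of-the-envelope figures derived from the abstract's numbers.
# Assumption: "Full HD" = 1920 x 1080 pixels (not stated in the abstract).
FULL_HD_PIXELS = 1920 * 1080

CLOCK_HZ = 100e6         # 100 MHz, as reported
MAX_FPS = 28.7           # throughput of the pipelined scheme, as reported
LATENCY_BOUND_S = 0.055  # latency bound of the alternative scheme, as reported


def pixel_rate(fps, pixels_per_frame=FULL_HD_PIXELS):
    """Pixels that must be consumed per second at the given frame rate."""
    return fps * pixels_per_frame


def cycles_per_pixel(clock_hz, fps, pixels_per_frame=FULL_HD_PIXELS):
    """Clock cycles available per input pixel at the given frame rate."""
    return clock_hz / pixel_rate(fps, pixels_per_frame)


def frame_period_s(fps):
    """Time between two consecutive frames, in seconds."""
    return 1.0 / fps


if __name__ == "__main__":
    print(f"pixel rate:       {pixel_rate(MAX_FPS) / 1e6:.1f} Mpixel/s")
    print(f"cycles per pixel: {cycles_per_pixel(CLOCK_HZ, MAX_FPS):.2f}")
    print(f"frame period:     {frame_period_s(MAX_FPS) * 1e3:.1f} ms")
```

Under these assumptions, 28.7 frames per second corresponds to about 59.5 Mpixel/s, which leaves roughly 1.68 clock cycles per pixel at 100 MHz, and a frame period of about 34.8 ms. The abstract states the throughput and the 55 ms latency bound as alternatives ("either ... or"), so the two figures belong to different processing schemes and should not be read as holding simultaneously.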