Object recognition demands substantial computing power for complex image processing tasks, which makes it challenging to meet real-time requirements under the low-power constraints of embedded systems. This paper presents a configurable heterogeneous multicore architecture for real-time object recognition that combines a dual-mode linear processor array and a cellular neural network on a network-on-chip platform. A bio-inspired, attention-based object recognition algorithm is devised to reduce the computational complexity of object recognition. The cellular neural network accelerates the visual attention algorithm so that salient image regions are selected rapidly. The dual-mode parallel processor can be configured into single instruction, multiple data (SIMD) or multiple instruction, multiple data (MIMD) mode to perform data-intensive image processing operations while exploiting the pixel-level and feature-level parallelism required for attention-based object recognition. A hybrid parallelization strategy for the algorithm is adopted on the proposed architecture to maximize the performance improvement. Performance analysis using a cycle-accurate architecture simulator shows that the proposed architecture achieves a 2.8-fold speedup on the target algorithm over a conventional massively parallel SIMD architecture at low hardware cost overhead. A prototype chip of the proposed architecture, fabricated in 0.13 µm complementary metal-oxide-semiconductor (CMOS) technology, achieves real-time object recognition at 22 frames/s with less than 600 mW power consumption.
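To illustrate why attention-based selection can reduce the recognition workload, the following is a minimal software sketch, not the algorithm or hardware mapping described in this paper. It assumes a simple center-surround contrast computed with box filters as a stand-in for the visual attention stage, and it keeps only the highest-scoring tiles for later feature extraction. The function names (`box_mean`, `saliency_map`, `salient_tiles`), the filter radii, the tile size, and the kept fraction are all hypothetical choices made for illustration.

```python
def box_mean(img, r):
    """Mean of each pixel's (2r+1) x (2r+1) neighbourhood, clipped at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - r), min(h, y + r + 1))
            xs = range(max(0, x - r), min(w, x + r + 1))
            total = sum(img[j][i] for j in ys for i in xs)
            out[y][x] = total / (len(ys) * len(xs))
    return out


def saliency_map(img, r_center=1, r_surround=3):
    """Center-surround contrast: |fine-scale mean - coarse-scale mean| per pixel."""
    center = box_mean(img, r_center)
    surround = box_mean(img, r_surround)
    return [[abs(c - s) for c, s in zip(c_row, s_row)]
            for c_row, s_row in zip(center, surround)]


def salient_tiles(sal, tile=4, keep_fraction=0.25):
    """Rank tiles by mean saliency; return top-left corners of the kept tiles."""
    h, w = len(sal), len(sal[0])
    scores = []
    for ty in range(0, h, tile):
        for tx in range(0, w, tile):
            vals = [sal[y][x]
                    for y in range(ty, min(ty + tile, h))
                    for x in range(tx, min(tx + tile, w))]
            scores.append((sum(vals) / len(vals), (ty, tx)))
    scores.sort(reverse=True)  # highest mean saliency first
    n_keep = max(1, int(len(scores) * keep_fraction))
    return [pos for _, pos in scores[:n_keep]]


# Toy input (assumed): a 16x16 dark image with one bright 4x4 patch.
image = [[0.0] * 16 for _ in range(16)]
for y in range(8, 12):
    for x in range(4, 8):
        image[y][x] = 1.0

kept = salient_tiles(saliency_map(image))
# Only the kept tiles would be passed on to the costly feature-extraction stage.
```

In this sketch the per-pixel saliency computation is uniform across the image, which is the kind of work that maps naturally onto pixel-parallel hardware, while the per-tile processing that follows is independent for each selected region.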