Neural Network-Based Face Detection
IEEE Transactions on Pattern Analysis and Machine Intelligence
Detecting Faces in Images: A Survey
IEEE Transactions on Pattern Analysis and Machine Intelligence
Robust Real-Time Face Detection
International Journal of Computer Vision
A Parallel Architecture for Hardware Face Detection
ISVLSI '06 Proceedings of the IEEE Computer Society Annual Symposium on Emerging VLSI Technologies and Architectures
Fpga-based face detection system using Haar classifiers
Proceedings of the ACM/SIGDA international symposium on Field programmable gate arrays
High-level synthesis: productivity, performance, and software constraints
Journal of Electrical and Computer Engineering - Special issue on ESL Design Methodology
Efficient compilation of CUDA kernels for high-performance computing on FPGAs
ACM Transactions on Embedded Computing Systems (TECS) - Special issue on application-specific processors
A hardware architecture for real-time object detection using depth and edge information
ACM Transactions on Embedded Computing Systems (TECS)
Hi-index | 0.00 |
Face detection is the cornerstone of a wide range of applications such as video surveillance, robotic vision and biometric authentication. One of the biggest challenges in face detection based applications is the speed at which faces can be accurately detected. In this paper, we present a novel SoC (System on Chip) architecture for ultra fast face detection in video or other image rich content. Our implementation is based on an efficient and robust algorithm that uses a cascade of Artificial Neural Network (ANN) classifiers on AdaBoost trained Haar features. The face detector architecture extracts the coarse grained parallelism by efficiently overlapping different computation phases while taking advantage of the finegrained parallelism at the module level. We provide details on the parallelism extraction achieved by our architecture and show experimental results that portray the efficiency of our face detection implementation. For the implementation and evaluation of our architecture we used the Xilinx FX130T Virtex5 FPGA device on the ML510 development board. Our performance evaluations indicate that a speedup of around 100X can be achieved over a SSE-optimized software implementation running on a 2.4GHz Core-2 Quad CPU. The detection speed reaches 625 frames per sec (fps).