A dynamically configurable coprocessor for convolutional neural networks
Proceedings of the 37th annual international symposium on Computer architecture
A programmable parallel accelerator for learning and classification
Proceedings of the 19th international conference on Parallel architectures and compilation techniques
Image and video processing on CUDA: state of the art and future directions
MACMESE'11 Proceedings of the 13th WSEAS international conference on Mathematical and computational methods in science and engineering
A Massively Parallel, Energy Efficient Programmable Accelerator for Learning and Classification
ACM Transactions on Architecture and Code Optimization (TACO)
GPGPU implementation of growing neural gas: Application to 3D scene reconstruction
Journal of Parallel and Distributed Computing
Expert Systems with Applications: An International Journal
Hi-index | 0.00 |
In this paper, we consider the problem of face detection under pose variations. Unlike other contributions, a focus of this work resides within efficient implementation utilizing the computational powers of modern graphics cards. The proposed system consists of a parallelized implementation of convolutional neural networks (CNNs) with a special emphasize on also parallelizing the detection process. Experimental validation in a smart conference room with 4 active ceiling-mounted cameras shows a dramatic speed-gain under real-life conditions.