Exploiting fine-grain thread parallelism on multicore architectures
Scientific Programming - Software Development for Multi-core Computing Systems
Task-Based execution of nested OpenMP loops
IWOMP'12 Proceedings of the 8th international conference on OpenMP in a Heterogeneous World
We present a novel high-performance face detection system that combines a neural network-based classification algorithm with an efficient OpenMP parallelization. We discuss the design of the system in detail and assess it experimentally. Our parallelization strategy starts with a single level of threads and then exploits nested parallel regions, which improves image-processing throughput by up to 19%. The system processes images in real time (38 images/s) and sustains almost linear speedups on a quad-core processor with a particular OpenMP runtime library. Copyright © 2009 John Wiley & Sons, Ltd.
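To illustrate what moving from one level of threads to nested parallel regions can look like, here is a minimal sketch in C with OpenMP. It is not the authors' code: `classify_window` is a hypothetical stand-in for the neural-network classifier (it only thresholds mean intensity), and the function names, the row/column decomposition, and the thread-count parameters are all assumptions made for the example. The outer region distributes window rows across threads, and each row opens an inner region over window columns.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical stand-in for the neural-network classifier (assumption):
 * reports a "face" when the mean intensity of the window exceeds 128. */
static int classify_window(const unsigned char *img, int width,
                           int x, int y, int win)
{
    long sum = 0;
    for (int dy = 0; dy < win; dy++)
        for (int dx = 0; dx < win; dx++)
            sum += img[(size_t)(y + dy) * width + (x + dx)];
    return sum > 128L * win * win;
}

/* Counts windows classified as faces in a grayscale image.
 * Outer level: one parallel loop over window rows.
 * Inner level: a nested parallel loop over window columns in each row.
 * Without OpenMP support the pragmas are ignored and the code runs serially. */
int count_detections(const unsigned char *img, int width, int height,
                     int win, int outer_threads, int inner_threads)
{
    int total = 0;
#ifdef _OPENMP
    /* Allow two active levels so the inner region is not serialized. */
    omp_set_max_active_levels(2);
#endif
    #pragma omp parallel for num_threads(outer_threads) \
            reduction(+:total) schedule(dynamic)
    for (int y = 0; y <= height - win; y++) {
        int row_hits = 0;
        #pragma omp parallel for num_threads(inner_threads) \
                reduction(+:row_hits)
        for (int x = 0; x <= width - win; x++)
            row_hits += classify_window(img, width, x, y, win);
        total += row_hits;
    }
    return total;
}
```

A call such as `count_detections(img, width, height, 20, 2, 2)` would use two outer threads, each spawning two inner threads, on a quad-core machine. Whether nesting helps in practice depends on the OpenMP runtime's overhead for creating inner teams, which is consistent with the abstract attributing its results to a particular runtime library.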