A software-based dynamic-warp scheduling approach for load-balancing the Viola-Jones face detection algorithm on GPUs

  • Authors:
  • Tan Nguyen;Daniel Hefenbrock;Jason Oberg;Ryan Kastner;Scott Baden

  • Affiliations:
  • -;-;-;-;-

  • Venue:
  • Journal of Parallel and Distributed Computing
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

Face detection is a key component in applications such as security surveillance and human-computer interaction systems, and real-time recognition is essential in many scenarios. The Viola-Jones algorithm is an attractive means of meeting the real time requirement, and has been widely implemented on custom hardware, FPGAs and GPUs. We demonstrate a GPU implementation that achieves competitive performance, but with low development costs. Our solution treats the irregularity inherent to the algorithm using a novel dynamic warp scheduling approach that eliminates thread divergence. This new scheme also employs a thread pool mechanism, which significantly alleviates the cost of creating, switching, and terminating threads. Compared to static thread scheduling, our dynamic warp scheduling approach reduces the execution time by a factor of 3. To maximize detection throughput, we also run on multiple GPUs, realizing 95.6 FPS on 5 Fermi GPUs.