On the Design of Cascades of Boosted Ensembles for Face Detection

  • Authors:
  • S. Charles Brubaker; Jianxin Wu; Jie Sun; Matthew D. Mullin; James M. Rehg

  • Affiliations:
  • School of Interactive Computing, Georgia Institute of Technology, Atlanta, USA 30332-0760 (all authors)

  • Venue:
  • International Journal of Computer Vision
  • Year:
  • 2008

Abstract

Cascades of boosted ensembles have become popular in the object detection community following their highly successful introduction in the face detector of Viola and Jones. Since then, researchers have sought to improve upon the original approach by incorporating new methods along a variety of axes (e.g., alternative boosting methods and feature sets). Nevertheless, key decisions about how many hypotheses to include in an ensemble and the appropriate balance of detection and false positive rates in the individual stages are often made by user intervention or by an automatic method that produces unnecessarily slow detectors. We propose a novel method for making these decisions, which exploits the shape of the stage ROC curves in ways that have been previously ignored. The result is a detector that is significantly faster than the one produced by the standard automatic method. When this algorithm is combined with a recycling method for reusing the outputs of early stages in later ones and with a retracing method that inserts new early rejection points in the cascade, the detection speed matches that of the best hand-crafted detector. We also exploit joint distributions over several features in weak learning to improve overall detector accuracy, and explore ways to improve training time by aggressively filtering features.
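
To make the trade-off between per-stage detection and false positive rates concrete, the following is a minimal Python sketch of a generic rejection cascade, not the method proposed in the paper. It assumes each stage is described by a detection rate and a false positive rate measured on the windows that reach that stage, so that the overall rates are the products of the stage rates. The function names (`cascade_rates`, `expected_stages_evaluated`, `cascade_classify`) and all numeric values are hypothetical and chosen only for illustration.

```python
from math import prod
from typing import Callable, Sequence, Tuple

# Assumption: each stage is summarized by (detection_rate, false_positive_rate),
# both conditional on a window having passed all earlier stages.
StageRates = Tuple[float, float]


def cascade_rates(stage_rates: Sequence[StageRates]) -> Tuple[float, float]:
    """Overall (detection rate, false positive rate) of the whole cascade."""
    detection = prod(d for d, _ in stage_rates)
    false_positive = prod(f for _, f in stage_rates)
    return detection, false_positive


def expected_stages_evaluated(stage_rates: Sequence[StageRates]) -> float:
    """Expected number of stages run on a non-face window.

    Stage i is evaluated only if the window passed stages 1..i-1, which
    happens with probability f_1 * ... * f_(i-1) for a non-face window.
    """
    expected = 0.0
    reach_probability = 1.0
    for _, f in stage_rates:
        expected += reach_probability
        reach_probability *= f
    return expected


def cascade_classify(
    window: object,
    stages: Sequence[Tuple[Callable[[object], float], float]],
) -> bool:
    """Run (score_function, threshold) stages in order, rejecting early."""
    for score, threshold in stages:
        if score(window) < threshold:
            return False  # rejected: later stages are never evaluated
    return True


if __name__ == "__main__":
    # Illustrative values only: ten identical stages.
    rates = [(0.99, 0.5)] * 10
    print(cascade_rates(rates))
    print(expected_stages_evaluated(rates))
```

Under these assumptions, lowering a stage's false positive rate reduces the expected number of stages evaluated per window, which is why the balance chosen for each stage affects detection speed as well as accuracy.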